Statistics 2 Course, Unit 7: Let’s Learn with Questions
Non-Parametric Statistics
Briefly explain parametric tests.
Parametric tests require certain conditions or assumptions to be met in order to make decisions and estimates about population parameters from sample data. The reliability of a parametric test depends heavily on the validity of these assumptions. In addition, parametric tests require measurements on at least an interval scale.
Briefly explain the term ‘non-parametric statistical test’.
A non-parametric statistical test does not require specific conditions or assumptions about the population parameters to be met. The assumptions it does make are that the observations in the sample are independent and that the variable under study has underlying continuity. These assumptions are fewer and weaker than those used in parametric tests.
List the non-parametric tests for the one-sample problem.
1- Binomial Test
2- The Sign Test
3- The Chi-Square Goodness of Fit Test
4- The Kolmogorov-Smirnov Test
What is the binomial test used for?
The binomial distribution is used when the probability of obtaining one of two possible outcomes on a single trial is known and we want the probability that this outcome appears x times in n independent trials. The binomial distribution is the sampling distribution of proportions estimated from sample data. The binomial test tells us whether it is reasonable to conclude that the proportion observed in our sample could have been drawn from a population with a particular value of P.
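As a rough illustration of the idea above, the following sketch computes a one-sided exact p-value for the binomial test. The function names, the choice of an upper-tail alternative, and the data (8 successes in 10 trials, hypothesized P = 0.5) are assumptions made only for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X follows a binomial distribution with n trials and probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_test_upper(x, n, p0):
    """One-sided p-value P(X >= x) under H0: P = p0."""
    return sum(binomial_pmf(k, n, p0) for k in range(x, n + 1))

# Hypothetical data: 8 successes out of n = 10 trials, H0: P = 0.5
p_value = binomial_test_upper(8, 10, 0.5)
print(p_value)
```

A small p-value would suggest that the sample proportion is unlikely to have come from a population with the hypothesized P.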
What is ‘the sign test’ used for?
The sign test gets its name from the positive and negative signs of the differences computed from the observations’ measurements. It is used to test hypotheses about the population median, M. Inferences about the population median are important when the population has a highly skewed distribution. In skewed distributions, the population median is more centrally located than the population mean, so the median is regarded as the better measure of location when the population is skewed. The sign test is a non-parametric alternative to the one-sample t test and the paired-sample t test, and it also suits research in which quantitative measurement is impossible but the measurements can be ranked with respect to each other.
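The one-sample sign test can be sketched as follows, assuming a two-sided alternative and dropping observations equal to the hypothesized median. Under H0 the number of positive signs follows a binomial distribution with P = 0.5. The function name, the sample values, and the hypothesized median of 13 are hypothetical.

```python
from math import comb

def sign_test(sample, m0):
    """Two-sided sign test of H0: median = m0.

    Returns (number of plus signs, number of minus signs, p-value).
    """
    diffs = [x - m0 for x in sample if x != m0]  # observations equal to m0 are dropped
    n = len(diffs)
    n_plus = sum(d > 0 for d in diffs)
    n_minus = n - n_plus
    k = min(n_plus, n_minus)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for the two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return n_plus, n_minus, min(1.0, 2 * tail)

# Hypothetical sample, H0: M = 13
sample = [12, 15, 9, 20, 18, 14, 22, 17, 16, 19]
n_plus, n_minus, p_value = sign_test(sample, 13)
print(n_plus, n_minus, p_value)
```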
What is ‘the Chi-Square Goodness of Fit Test’ used for?
Goodness of fit tests are used to determine whether a data set comes from a particular distribution. Some example null and alternative hypotheses about the population distribution are given below.
H0: The population distribution is normal
H1: The population distribution is not normal
H0: The population distribution is Poisson
H1: The population distribution is not Poisson
The most commonly used goodness of fit test is the chi-square goodness of fit test.
The chi-square goodness of fit test is based on a comparison of the observed frequencies with the expected frequencies. First, the data are classified into k classes. For the ith class, the observed frequency is denoted by Oi and the expected frequency by Ei. The observed frequencies and the expected frequencies must each sum to the sample size n, that is, $\sum_{i=1}^{k} O_i = n$ and $\sum_{i=1}^{k} E_i = n$.
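The comparison of observed and expected frequencies can be sketched with the usual Pearson statistic, the sum over classes of (Oi - Ei)^2 / Ei, which the text above does not spell out. The function name and the die-roll counts are hypothetical, and the expected frequencies assume a fair die under H0.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O_i - E_i)^2 / E_i over the k classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected frequencies must both sum to n")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 120 rolls of a die, H0: all six faces are equally likely
observed = [18, 22, 20, 25, 15, 20]
n = sum(observed)
expected = [n / 6] * 6
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # k - 1 when no parameters are estimated from the data
print(chi2, df)
```

The statistic is then compared with a chi-square distribution with the stated degrees of freedom; large values indicate a poor fit.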
What is ‘the Kolmogorov-Smirnov Test’ used for?
The Kolmogorov-Smirnov test is another goodness of fit test. It is based on the agreement between the distribution of a set of observed values and a particular theoretical distribution. The Kolmogorov-Smirnov one-sample test determines whether the scores in the sample come from a population with that theoretical distribution. The test compares the observed cumulative frequency distribution with the cumulative frequency distribution that would occur under the null hypothesis. The Kolmogorov-Smirnov test requires data measured on at least an ordinal scale, which is one of its strengths.
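The comparison of cumulative distributions can be sketched by computing the largest vertical gap D between the empirical cumulative distribution of the sample and the theoretical one. The function names, the three sample values, and the uniform distribution on [0, 1] used as the null hypothesis are assumptions for illustration only.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so check both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (the hypothetical H0 here)."""
    return min(1.0, max(0.0, x))

sample = [0.1, 0.4, 0.7]  # hypothetical observations
d = ks_statistic(sample, uniform_cdf)
print(d)
```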
In real life, the population does not always follow a normal distribution. If you want to run a test in this case, what would you do?
In these cases, we can base inferences on non-parametric tests, which are valid over a wide range of distributions of the parent population.
Which non-parametric test is used for two-sample problems?
In such cases, we can use the Wilcoxon Rank Sum Test.
Provide information about the Wilcoxon Rank Sum Test.
The parametric independent-samples t test is based on several assumptions: the two samples are independent of each other, the populations under investigation are normally distributed, and the two populations from which the samples are taken have the same variability. The Wilcoxon rank sum test is a non-parametric alternative that involves fewer assumptions. In particular, it does not require the two populations to have a specific known distribution. As long as the samples are independent and the populations have the same variability, you may use the Wilcoxon rank sum test. Therefore, if you want to know whether two independent samples come from the same population, this method is a good choice. The Wilcoxon rank sum test assumes that the two population distributions are identical in shape, although they may differ in location, that is, one distribution may be shifted to the right or to the left of the other.
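A minimal sketch of the rank sum itself: the two samples are pooled, ranked together (tied values share the average of their ranks), and the ranks belonging to the first sample are added up. The helper names and the data are hypothetical, and the sketch stops at the statistic without computing a p-value.

```python
def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def rank_sum(sample_a, sample_b):
    """Sum of the ranks of sample_a within the pooled sample."""
    ranks = midranks(list(sample_a) + list(sample_b))
    return sum(ranks[: len(sample_a)])

# Hypothetical independent samples
a = [1.1, 2.3, 3.0, 4.8]
b = [2.9, 5.2, 6.1, 7.4, 8.0]
w = rank_sum(a, b)

# Under H0 the expected rank sum of the first sample is n1 * (n1 + n2 + 1) / 2
expected_w = len(a) * (len(a) + len(b) + 1) / 2
print(w, expected_w)
```

A rank sum far below or above its expected value suggests that one distribution is shifted relative to the other.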
Which non-parametric test is used for two dependent (paired) sample problems?
We can use the Wilcoxon Signed Rank Test for these problems.
Provide information about the Wilcoxon Signed Rank Test.
When the population of differences is symmetrical, a non-parametric test called the Wilcoxon signed rank test is often more powerful than the sign test for making inferences about the population median difference MD. Hence, the main assumption of the test is that the population of differences is continuous and symmetrical. The Wilcoxon signed rank test uses both the sign and the rank of the magnitude of the differences between pairs of measurements, and it is a non-parametric alternative to the paired t test when the differences are not normally distributed. In this problem the observations are paired: there is some effect, and the outcomes of a variable are collected on the same observations before and after it. For example, the blood pressure of a group of patients can be measured before and after a medicine is taken. The Wilcoxon Signed Rank Test does not require the normality assumption that parametric tests usually make.
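The use of both sign and rank can be sketched as follows: the absolute differences are ranked, and the ranks are summed separately for positive and negative differences. The function names and the blood pressure readings are hypothetical, and zero differences are dropped, which is one common convention assumed here.

```python
def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def signed_rank_sums(before, after):
    """Return (T+, T-): rank sums of the positive and negative differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    ranks = midranks([abs(d) for d in diffs])
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus

# Hypothetical blood pressure of six patients before and after a medicine
before = [150, 142, 138, 160, 155, 147]
after = [140, 145, 130, 148, 150, 146]
t_plus, t_minus = signed_rank_sums(before, after)
print(t_plus, t_minus)
```

If the median difference were 0, the two rank sums would be expected to be roughly equal; a large imbalance points towards one of the alternative hypotheses.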
Which non-parametric test is used for k independent sample problems?
We can use the Kruskal-Wallis Test for these problems.
Provide information on the Kruskal-Wallis Test.
The one-way analysis of variance assumes independent samples, normality, and equal variances. When the conditions of normality and equal variances do not hold, the Kruskal-Wallis test is a non-parametric alternative that involves fewer conditions. The Kruskal-Wallis test is used to decide whether k independent samples are drawn from different populations. It extends the Wilcoxon rank sum test to comparisons of more than two populations. The Kruskal-Wallis test assumes that the variable under consideration is measured on at least an ordinal scale.
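As a sketch of how the rank sum idea extends to k samples, the code below computes the usual Kruskal-Wallis statistic H = 12 / (N(N + 1)) * sum(Rj^2 / nj) - 3(N + 1), where Rj is the rank sum of sample j in the pooled ranking. This formula is not given in the text above, no correction for ties is applied, and the function names and data are hypothetical.

```python
def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def kruskal_wallis_h(*samples):
    """Kruskal-Wallis H statistic without tie correction."""
    pooled = [x for s in samples for x in s]
    n_total = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_sum = sum(ranks[start : start + len(s)])  # rank sum of this sample
        total += r_sum**2 / len(s)
        start += len(s)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical measurements from k = 3 independent samples
g1 = [27, 31, 29]
g2 = [35, 38, 33]
g3 = [41, 44, 40]
h = kruskal_wallis_h(g1, g2, g3)
print(h)
```

Large values of H indicate that the rank sums differ more than would be expected if the k distributions were identical.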
How do we write the hypotheses in the Wilcoxon Signed Rank Test?
The hypotheses in the Wilcoxon Signed Rank Test are as follows.
H0: The distribution of differences is symmetrical around MD = 0
H1a: The differences tend to be larger than MD = 0
H1b: The differences tend to be smaller than MD = 0
H1c: The differences tend to be larger than MD = 0 or smaller than MD = 0
Which non-parametric statistic is used as a measure of correlation?
We can use Spearman’s Rank Correlation Coefficient for this purpose.
Provide information about the Spearman’s Rank Correlation Coefficient.
Of all the statistical techniques involving ranks, Spearman’s rank correlation coefficient is the earliest, and today it is well known among scientists. Spearman’s rank correlation coefficient is a measure of association for ranked data. In addition, it can be used when the relation between two variables is not linear. The coefficient takes values between -1 and +1. A value close to +1 indicates a positive association between the variables, whereas a value close to -1 indicates a negative association. A value near 0 means there is no association between the two variables.
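One way to sketch the coefficient is to rank each variable and then compute the ordinary correlation of the two sets of ranks. The function names and the data are hypothetical; the second variable is deliberately a nonlinear but increasing function of the first to illustrate the remark about nonlinear relations.

```python
from math import sqrt

def midranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # nonlinear but always increasing with x
z = [2, 1, 4, 3, 5]    # mostly increasing, with two swapped pairs

rho_xy = spearman_rho(x, y)
rho_xz = spearman_rho(x, z)
print(rho_xy, rho_xz)
```

Because only the ranks matter, the perfectly increasing but nonlinear pair reaches the upper limit of +1, while the partly shuffled pair gives a smaller positive value.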
Fill in the gap in the following sentence.
Spearman’s rank correlation coefficient takes values between ……………..
Spearman’s rank correlation coefficient takes values between -1 and +1.
Write down the null hypothesis constructed in the Kruskal-Wallis Test.
The null hypothesis constructed in the Kruskal-Wallis Test is “The k distributions are identical”.