Category Archives: Econometrics

Z-Test


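If H0 is true then:

$$ z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim N(0, 1) $$

where X̄ is the sample mean, μ0 is the mean stated in H0, σ is the known population standard deviation and n is the sample size.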

 

This is our test statistic.

We reject H0 if the calculated value of our test statistic is less than -zα/2 or greater than +zα/2 (i.e., if it falls far enough out in the tails of the standard normal distribution that H0 is unlikely to be true).

 

Example:

The weights of fish in an aquaculture pond are considered to be normally distributed with a mean of 3.1 kg and a standard deviation of 1.1 kg. A random sample of size 30 is selected from the pond and the sample mean is found to be 2.37 kg. Is there sufficient evidence to indicate that the mean weight of the fish differs from 3.1 kg? Use a 10% level of significance.

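Hypotheses: H0: μ = 3.1 against H1: μ ≠ 3.1

Decision Rule: With α = 0.10, zα/2 = z0.05 = 1.645, so reject H0 if z < -1.645 or z > 1.645

Test Statistic:

$$ z = \frac{2.37 - 3.1}{1.1 / \sqrt{30}} \approx -3.63 $$

Decision: Since -3.63 < -1.645, reject H0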

Conclusion: At the 10% level of significance, there is sufficient evidence that the mean weight of the fish differs from 3.1 kg.
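The fish example can be checked with a few lines of Python. This is only a minimal sketch using the standard library: the function name `two_tailed_z_test` is made up for illustration, and it assumes the population standard deviation is known, as in the example.

```python
import math
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Two-tailed z-test for a mean when the population sigma is known."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Upper critical value z_{alpha/2} of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    reject = z < -z_crit or z > z_crit
    return z, z_crit, reject


# Fish example: mu0 = 3.1 kg, sigma = 1.1 kg, n = 30, sample mean = 2.37 kg.
z, z_crit, reject = two_tailed_z_test(2.37, 3.1, 1.1, 30, alpha=0.10)
print(f"z = {z:.2f}, critical values = +/-{z_crit:.3f}, reject H0: {reject}")
```

With these inputs z comes out at roughly -3.63, well beyond the lower critical value of about -1.645, so H0 is rejected.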

 

 

 

 

 

Hypothesis Testing


 

  • A hypothesis is a statement (assumption) about a population parameter
    • population mean (Example: The mean monthly cell phone bill of this city is  μ = $42)
    • population proportion (Example: The proportion of adults in this city with cell phones is  π = 0.68)
  • Null Hypothesis
    • The hypothesis that assumes the status quo – that the old theory, method or standard is still true; the complement of the alternative hypothesis
    • Always contains an ‘=’, ‘≤’ or ‘≥’ sign
    • May or may not be rejected
    • Is always about a population parameter, not about a sample statistic
  • Alternative Hypothesis
    • The hypothesis that complements the null hypothesis.
    • Usually it is the hypothesis that the researcher is interested in proving
  • The Null and Alternative Hypotheses are mutually exclusive
    • i.e. only one of them can be true
  • The Null Hypothesis is assumed to be true
  • The burden of proof falls on the Alternative Hypothesis
  • Example: investigate if the mean monthly cell phone bill is $42
    • H0: μ = 42
    • H1: μ ≠ 42

Level of Significance and rejection region

 

 

 

 

 

 

 

 

 

 

 

Steps for the hypothesis test…

  1. State the null hypothesis, H0, and the alternative hypothesis, H1
  2. Choose the level of significance, α, and the sample size, n
  3. Determine the appropriate test statistic and sampling distribution
  4. Determine the critical values that divide the rejection and non-rejection regions
  5. Collect data and compute the value of the test statistic
  6. Make the statistical decision and state the managerial conclusion
    • If the test statistic falls into the non-rejection region, do not reject the null hypothesis H0
    • If the test statistic falls into the rejection region, reject the null hypothesis H0
    • Express the managerial conclusion in the context of the real-world problem
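Steps 4 and 6 can be sketched in code. The helper names `rejection_region` and `decide` are hypothetical, and the sketch assumes the test statistic follows a standard normal distribution under H0. It covers the two-tailed case (H0 with ‘=’) as well as the lower- and upper-tailed cases (H0 with ‘≥’ or ‘≤’).

```python
from statistics import NormalDist


def rejection_region(alpha, tail="two"):
    """Return (lower, upper) critical z values; None means no bound on that side."""
    std_normal = NormalDist()
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return -upper, upper
    if tail == "lower":
        return std_normal.inv_cdf(alpha), None
    if tail == "upper":
        return None, std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


def decide(z, alpha, tail="two"):
    """Reject H0 only if z falls in the rejection region."""
    lower, upper = rejection_region(alpha, tail)
    in_lower = lower is not None and z < lower
    in_upper = upper is not None and z > upper
    return "reject H0" if (in_lower or in_upper) else "do not reject H0"


# Same hypothetical statistic, three different alternatives.
for tail in ("two", "lower", "upper"):
    print(tail, rejection_region(0.05, tail), decide(-1.8, 0.05, tail))
```

The same statistic can lead to different decisions: z = -1.8 is not extreme enough for a two-tailed test at α = 0.05, but it does fall in the rejection region of a lower-tailed test.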

 

  • p-value: Probability of obtaining a test statistic at least as extreme (≤ or ≥) as the observed sample value, given that H0 is true
    • Also called the observed level of significance
    • Smallest value of α for which H0 can be rejected
    • Obtain the p-value from a table or computer
  • If p-value < α, reject H0
  • If p-value ≥ α, do not reject H0
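As a rough illustration of the p-value rule, the sketch below computes the p-value of a z statistic from the standard normal CDF instead of a table. The function name `z_p_value` and the observed value 2.10 are assumptions made up for this example.

```python
from statistics import NormalDist


def z_p_value(z, tail="two"):
    """p-value of an observed z statistic, assuming z ~ N(0, 1) under H0."""
    cdf = NormalDist().cdf
    if tail == "two":
        return 2 * cdf(-abs(z))   # both tails beyond |z|
    if tail == "lower":
        return cdf(z)             # area to the left of z
    if tail == "upper":
        return 1 - cdf(z)         # area to the right of z
    raise ValueError("tail must be 'two', 'lower' or 'upper'")


alpha = 0.05
z_observed = 2.10  # hypothetical test statistic, for illustration only
p = z_p_value(z_observed)
print(f"p-value = {p:.4f}")
print("reject H0" if p < alpha else "do not reject H0")
```

Here the two-tailed p-value is about 0.036, which is below α = 0.05, so H0 would be rejected. The critical-value approach gives the same decision, since 2.10 > 1.96.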

Two population means tail test

 

 

 

 

 

 

 

 

 

 

 

Rules to follow:

Hypotheses:

Decision Rule:

Test Statistic:

Decision:

Conclusion:

Data Analysis for Economists – Part I


Describing Data

Every economist needs to be able to collect, analyse, manipulate, understand and report data. In a daily research environment, we need to deal with randomness and variation in order to apply our knowledge. Therefore we are going to summarize the most important and useful tools for every economist.

Key Definitions:

  • A population consists of all the members of a group about which you want to draw a conclusion. The size of the population depends on what you are interested in. (μ, σ, Ν)
  • A sample is the portion of the population selected for analysis. Collecting information on the whole population can be difficult and costly, therefore we sample. (X̄, s, n)
  • A parameter is a numerical measure that describes a characteristic of a population
  • A statistic is a numerical measure that describes a characteristic of a sample

A note on Notation

  • Greek letters (μ, σ) are used for population parameters; the population size is written N
  • Roman letters (X̄, s, n) are used for sample statistics

 

Scatter diagrams are very common in econometrics and are used to examine possible relationships between two numerical variables.

  • In a scatter diagram one variable is measured on the vertical axis (Y) and the other variable is measured on the horizontal axis (X)
    • X = independent variable
    • Y = dependent variable


 

 

 

 

 

 

 

 

 

Figure 1: Plot A: Scatter Plot Relationship between Share of Food (WFOOD) and Total Expenditure (TOTEXP).

So, how do we actually describe our data?


 

We will need the mean, median and mode of the data, but most of the discussion is about variation and shape: range, interquartile range, variance, standard deviation, coefficient of variation and skewness.

So, let’s start with central tendency. What are the mean, median and mode?

Mean

  • Commonly called the average
  • Calculated as the sum of values divided by the number of values
  • Affected by extreme values (outliers)

Population mean: $\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$

Sample mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

 

Median

  • In an ordered array, the median is the ‘middle’ number: it is picked by its position in the ranked data, not by averaging all the values.
  • The median sits at position (n + 1)/2 in the ranked data. This is the position of the median, not its value.

Rule 1: If the number of values in the data set is odd, the median is the middle ranked value

Rule 2: If the number of values in the data set is even, the median is the mean (average) of the two middle ranked values

 

Mode

  • Value that occurs most often (the most frequent). It can be more than one value.
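The three measures of central tendency are available in Python's standard `statistics` module. The data below are hypothetical and include one extreme value, to show that the mean reacts to outliers while the median barely moves.

```python
import statistics

# Hypothetical sample with one extreme value (the 100).
data = [2, 3, 3, 5, 7, 8, 100]

mean = statistics.mean(data)        # sum of values / number of values
median = statistics.median(data)    # middle ranked value (n is odd here)
modes = statistics.multimode(data)  # list of the most frequent value(s)
print(mean, median, modes)

# Dropping the outlier moves the mean a lot but the median only slightly.
trimmed = data[:-1]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)  # n is even: mean of the two middle values
print(trimmed_mean, trimmed_median)
```

`multimode` returns a list because, as noted above, a data set can have more than one mode.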

 

Quartiles

  • Quartiles split the ranked data into 4 segments with an equal number of values per segment
  • The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger

Q1 position = (n+1)/4 ranked value

  • The second quartile, Q2, is the same as the median (50% are smaller, 50% are larger)

Q2 position = (n+1)/2 ranked value

  • Only 25% of the observations are greater than the third quartile, Q3

Q3 position = 3(n+1)/4 ranked value
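The position formulas can be turned into a small routine. This is a sketch under one assumption that the text does not settle: when a position is fractional, it interpolates linearly between the two neighbouring ranked values (some textbooks round the position instead). The function names and the data are hypothetical.

```python
import math


def value_at_position(ranked, position):
    """Value at a 1-based, possibly fractional, position in ranked data."""
    n = len(ranked)
    position = min(max(position, 1), n)  # clamp to the valid range
    lower = math.floor(position)
    fraction = position - lower
    if lower == n:
        return ranked[-1]
    # Linear interpolation between the two neighbouring ranked values.
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])


def quartiles(data):
    """Q1, Q2, Q3 located at positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    ranked = sorted(data)
    n = len(ranked)
    q1 = value_at_position(ranked, (n + 1) / 4)
    q2 = value_at_position(ranked, (n + 1) / 2)
    q3 = value_at_position(ranked, 3 * (n + 1) / 4)
    return q1, q2, q3


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # hypothetical sample, n = 9
print(quartiles(data))
```

For n = 9 the positions are 2.5, 5 and 7.5, so Q1 lies halfway between the 2nd and 3rd ranked values, Q2 is the 5th ranked value and Q3 lies halfway between the 7th and 8th.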

 

Variation

Measures of variation give information on the spread or variability of the data values.

  • RANGE:

Difference between the largest and the smallest values in a set of data

$$ \text{Range} = X_{\text{largest}} - X_{\text{smallest}} $$

 

  • INTERQUARTILE RANGE:

Like the median, Q1 and Q3, the IQR is a resistant summary measure. It sidesteps outlier problems because the highest- and lowest-valued observations are left out of the calculation:

IQR = 3rd quartile – 1st quartile
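To see what "resistant" means in practice, the sketch below compares the range and the IQR on a hypothetical sample before and after its largest value is replaced by an outlier. It relies on `statistics.quantiles`, whose default "exclusive" method places the quartiles at (n+1)-based positions with linear interpolation; the helper names are illustrative.

```python
import statistics


def data_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)


def iqr(data):
    """Third quartile minus first quartile."""
    # With n=4, statistics.quantiles returns the three cut points [Q1, Q2, Q3].
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1


clean = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # hypothetical sample
with_outlier = clean[:-1] + [220]              # largest value replaced by an outlier

print(data_range(clean), iqr(clean))
print(data_range(with_outlier), iqr(with_outlier))
```

The range jumps from 11 to 209, while the IQR is unchanged because the outlier lies outside the middle half of the data.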

 

  • VARIANCE: The average of the squared deviations from the mean; it shows variation about the mean.

Advantages:

  • Each value in the data set is used in the calculation
  • Values far from the mean are given extra weight as deviations from the mean are squared

Disadvantages:

  • Sensitive to extreme values (outliers)
  • Measures absolute variation, not relative variation

 

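Sample variance:

$$ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} $$

The sample standard deviation S is the square root of the sample variance. Dividing by (n-1) rather than n corrects the bias of the sample variance as an estimator of the population variance.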
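A short hand calculation makes the role of the (n-1) denominator visible. The data are hypothetical, and the comparison with `statistics.variance` and `statistics.pvariance` is only there as a sanity check.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7]  # hypothetical sample

n = len(data)
mean = sum(data) / n
squared_deviations = [(x - mean) ** 2 for x in data]

sample_variance = sum(squared_deviations) / (n - 1)  # divides by n - 1
divide_by_n = sum(squared_deviations) / n            # divides by n (population formula)
sample_std = math.sqrt(sample_variance)

print(sample_variance, divide_by_n, sample_std)

# The standard library agrees with the hand calculation.
assert math.isclose(sample_variance, statistics.variance(data))
assert math.isclose(divide_by_n, statistics.pvariance(data))
```

Dividing by n always gives the smaller number; on sample data it tends to understate the population variance, which is why the sample formula uses n-1.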

 

Variation

  • COEFFICIENT OF VARIATION:

Measures relative variation, i.e. it shows variation relative to the mean. It can be used to compare two or more sets of data measured in different units, and it is always expressed as a percentage (%):

$$ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% $$
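As a quick illustration, the sketch below computes the coefficient of variation as the sample standard deviation divided by the sample mean, times 100, for two hypothetical data sets measured in different units. The function name and the numbers are assumptions for the example.

```python
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100


# Hypothetical data in different units.
weights_kg = [2.0, 2.5, 3.0, 3.5, 4.0]
prices_usd = [40, 41, 42, 43, 44]

cv_weights = coefficient_of_variation(weights_kg)
cv_prices = coefficient_of_variation(prices_usd)
print(f"weights: {cv_weights:.1f}%  prices: {cv_prices:.1f}%")
```

The prices have the larger standard deviation in absolute terms (about 1.58 against 0.79), yet relative to their means the weights vary far more.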

 

Shape and Skewness

  • Describes how data are distributed
  • Measures of shape – Symmetric or skewed


 

 

Sample Covariance

  • The sample covariance measures the direction of the linear relationship between two numerical variables (direction of the association)

$$ \mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} $$

Sample Coefficient of Correlation r

  • Measures the relative strength of the linear relationship between two variables:

$$ r = \frac{\mathrm{cov}(X, Y)}{S_X S_Y} $$

where cov(X, Y) is the sample covariance and S_X and S_Y are the sample standard deviations of X and Y.
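The two measures can be computed by hand in a few lines. This is a minimal sketch with made-up numbers: the variable names echo the food share and total expenditure variables from the scatter plot, but the values are hypothetical and not taken from that data set.

```python
import math


def sample_covariance(x, y):
    """Sum of cross-deviations from the means, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_correlation(x, y):
    """Sample covariance scaled by the two sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # cov(X, X) is the sample variance of X
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


# Hypothetical data: food share falls as total expenditure rises.
total_expenditure = [10, 20, 30, 40, 50]
food_share = [0.50, 0.42, 0.38, 0.30, 0.25]

cov = sample_covariance(total_expenditure, food_share)
r = sample_correlation(total_expenditure, food_share)
print(f"covariance = {cov:.3f}, r = {r:.3f}")
```

The covariance only tells us the direction (negative here) and depends on the units of both variables. Dividing by the two standard deviations removes the units and gives r, which always lies between -1 and 1.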

 

Value of r          Interpretation
r = -1              PERFECT negative linear relationship
-1 < r ≤ -0.7       STRONG negative linear relationship
-0.7 < r ≤ -0.3     MODERATE negative linear relationship
-0.3 < r < 0        WEAK negative linear relationship
r = 0               No linear relationship
0 < r < 0.3         WEAK positive linear relationship
0.3 ≤ r < 0.7       MODERATE positive linear relationship
0.7 ≤ r < 1         STRONG positive linear relationship
r = 1               PERFECT positive linear relationship
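The cut-offs in the table are symmetric around zero, so they can be expressed in terms of the absolute value of r. The helper `interpret_r` below is a hypothetical sketch of that lookup; note that a correlation computed from real data will rarely be exactly 0 or ±1 because of floating-point rounding.

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal labels in the table."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "No linear relationship"
    strength = abs(r)
    direction = "positive" if r > 0 else "negative"
    if strength == 1:
        label = "PERFECT"
    elif strength >= 0.7:
        label = "STRONG"
    elif strength >= 0.3:
        label = "MODERATE"
    else:
        label = "WEAK"
    return f"{label} {direction} linear relationship"


for value in (-1, -0.7, -0.1, 0, 0.3, 0.85, 1):
    print(value, interpret_r(value))
```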