How-to: Analytic Functions - Top n Ranking Functions

RANK and DENSE_RANK - Calculate the rank of a value in a group, e.g. Pete is the 14th best performing salesman in the Northern region.

The difference between RANK() and DENSE_RANK() is that RANK() will leave a gap in the ranking sequence when there are ties. e.g. rather than listing places 1,2,2,3 you would get 1,2,2,4

The largest rank value produced by DENSE_RANK equals the number of distinct values in the dataset.

The ordering sequence will affect the result:
DESC = Top n
ASC = Bottom n.

RANK as an analytic function

The function will compute the rank of each row returned from a query with respect to the other rows returned by the query, as defined by the order_by_clause. It is possible to have several RANK () OVER functions in a single SQL statement.

Syntax (analytic function)
   RANK () OVER
      ([PARTITION BY query_partition_clause] ORDER BY order_by_clause);

Example:
   SELECT department_id, last_name, salary,
   RANK() OVER (PARTITION BY department_id
                ORDER BY salary DESC) "Rank"
   FROM employees;

RANK as an aggregate function

When used in this way RANK will compute the rank of one row returned from a query with respect to the other rows returned by the query, as defined by the order_by_clause.

Syntax (aggregate function):
   RANK (expression) WITHIN GROUP
      (ORDER BY order_by_clause [ASC|DESC] [NULLS FIRST|LAST] );

Example:
   SELECT RANK(25000) WITHIN GROUP 
   (ORDER BY salary DESC) "Rank of 25000" 
   FROM employees;

CUME_DIST analytic function

This function computes the relative (fractional) position of one value among a group of rows.
It returns a decimal value between 0 and 1

Syntax (analytic function)
   CUME_DIST () OVER
      ([PARTITION BY query_partition_clause] ORDER BY order_by_clause);

Example:
   SELECT department_id, last_name, salary,
   CUME_DIST() OVER (PARTITION BY department_id
                ORDER BY salary DESC) "Cumulative Distribution "
   FROM employees;

CUME_DIST as an aggregate function

When used in this way CUME_DIST will compute the (fractional) position of one row returned from a query with respect to the other rows returned by the query, as defined by the order_by_clause.

Syntax (aggregate function):
   CUME_DIST (expression) WITHIN GROUP
      (ORDER BY order_by_clause [ASC|DESC] [NULLS FIRST|LAST] );

Example:
   SELECT CUME_DIST(25000) WITHIN GROUP 
   (ORDER BY salary DESC) "Cumulative Distribution of 25000" 
   FROM employees;

PERCENT_RANK function

This is identical to the CUME_DIST function above, but is calculated as a percentage rather than a fractional amount - It returns a decimal value between 0 and 1
The first row in an ASCending set will have a Percent_Rank of 0, this works in the same way as the Excel function of the same name.

NTILE function

Summary statistics - evenly divide a data set into thirds, fourths or any other grouping, so for example you could divide sales data into four groups and then analyse rank within the top group. This works just like football leagues - top of the second division league, third place in the first division league etc.
(For other ways of grouping data see also WIDTH_BUCKET and the CASE function)

Syntax (analytic function)
   NTILE ([expr]) OVER
      ([PARTITION BY query_partition_clause] ORDER BY order_by_clause);

Example:
   SELECT department_id, last_name, salary,
   NTILE() OVER (PARTITION BY department_id
                ORDER BY salary DESC) "Percentile Distribution "
   FROM employees;

ROW_NUMBER function

Not to be confused with the ROWNUM pseudocolumn, this function assigns a unique number to each row: determined by the ORDER BY clause

Syntax (analytic function)

   ROW_NUMBER () OVER
      ([PARTITION BY query_partition_clause] ORDER BY order_by_clause);

Examples:

   SELECT employee_name, salary,
     ROW_NUMBER() OVER (ORDER BY salary DESC NULLS LAST) As "MyNum"
   FROM employees;

From 9i upwards this also works in PL/SQL:

  declare
  cursor csr is select ename,deptno,
  rank() over (partition by deptno order by sal)
  as sal_seq
  from emp
  order by deptno;
  begin
  for c in csr
  loop
  dbms_output.put_line(c.ename||' '||c.deptno||'  '||c.sal_seq);
  end loop;
  end;
SQL> /
MILLER 10  1
CLARK 10  2
KING 10  3
SMITH 20  1
ADAMS 20  2
JONES 20  3
SCOTT 20  4
FORD 20  4
JAMES 30  1
MARTIN 30  2
WARD 30  2
TURNER 30  4
ALLEN 30  5
BLAKE 30  6

If the PARTITION BY clause is missing, the function will operate over the entire dataset.

“OS/2 will dominate the desktop within five years” ~ Gartner Group (1988)