经验首页 前端设计 程序设计 Java相关 移动开发 数据库/运维 软件/图像 大数据/云计算 其他经验
当前位置:技术经验 » 数据库运维 » MS SQL Server » 查看文章
如何使用SQL窗口子句减少语法开销
来源:cnblogs  作者:萤火的萤  时间:2018/10/26 9:59:39  对本文有异议

SQL是一种冗长的语言,其中最冗长的特性之一是窗口函数.

在.最近遇到的堆栈溢出问题,有人要求计算某一特定日期的时间序列中的第一个值和最后一个值之间的差额:

输入

 
  1. volume tstamp
 
  1. ---------------------------
 
  1. 29011 2012-12-28 09:00:00
 
  1. 28701 2012-12-28 10:00:00
 
  1. 28830 2012-12-28 11:00:00
 
  1. 28353 2012-12-28 12:00:00
 
  1. 28642 2012-12-28 13:00:00
 
  1. 28583 2012-12-28 14:00:00
 
  1. 28800 2012-12-29 09:00:00
 
  1. 28751 2012-12-29 10:00:00
 
  1. 28670 2012-12-29 11:00:00
 
  1. 28621 2012-12-29 12:00:00
 
  1. 28599 2012-12-29 13:00:00
 
  1. 28278 2012-12-29 14:00:00
 

期望输出

 
  1. first last difference date
 
  1. ------------------------------------
 
  1. 29011 28583 428 2012-12-28
 
  1. 28800 28278 522 2012-12-29
 

如何编写查询

请注意,值和时间戳级数可能不相关。所以,没有一条规定如果Timestamp2 > Timestamp1然后Value2 < Value1。否则,这个简单的查询就能工作(使用PostgreSQL语法):

 
  1. SELECT
 
  1. max(volume) AS first,
 
  1. min(volume) AS last,
 
  1. max(volume) - min(volume) AS difference,
 
  1. CAST(tstamp AS DATE) AS date
 
  1. FROM t
 
  1. GROUP BY CAST(tstamp AS DATE);
 

有几种方法可以在不涉及窗口函数的组中找到第一个和最后一个值。例如:

  • 在Oracle中,可以使用第一和最后函数,由于某些神秘原因,这些函数没有编写。FIRST(...) WITHIN GROUP (ORDER BY ...)LAST(...) WITHIN GROUP (ORDER BY ...),与其他排序集聚合函数一样,但是some_aggregate_function(...) KEEP (DENSE_RANK FIRST ORDER BY ...)。围棋数字
  • 在PostgreSQL中,可以使用DISTINCT ON语法与 ORDER BYLIMIT

有关各种方法的更多细节可以在这里找到:
https://blog.jooq.org/2017/09/22/how-to-write-efficient-top-n-queries-in-sql

最好的方法是使用像Oracle这样的聚合函数,但是很少有数据库具有这种功能。所以,我们将使用FIRST_VALUELAST_VALUE窗口函数:

 
  1. SELECT DISTINCT
 
  1. first_value(volume) OVER (
 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 
  1. ) AS first,
 
  1. last_value(volume) OVER (
 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 
  1. ) AS last,
 
  1. first_value(volume) OVER (
 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 
  1. )
 
  1. - last_value(volume) OVER (
 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 
  1. ) AS diff,
 
  1. CAST(tstamp AS DATE) AS date
 
  1. FROM t
 
  1. ORDER BY CAST(tstamp AS DATE)
 

哎呀。

看上去不太容易读。但它将产生正确的结果。当然,我们可以包装列的定义。FIRSTLAST在派生表中,但这仍然会给我们留下两次窗口定义的重复:

 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 

援救窗口条款

幸运的是,至少有3个数据库实现了sql标准。WINDOW条款:

  • MySQL
  • PostgreSQL
  • Sybase SQL Anywhere

上面的查询可以重构为这个查询:

 
  1. SELECT DISTINCT
 
  1. first_value(volume) OVER w AS first,
 
  1. last_value(volume) OVER w AS last,
 
  1. first_value(volume) OVER w
 
  1. - last_value(volume) OVER w AS diff,
 
  1. CAST(tstamp AS DATE) AS date
 
  1. FROM t
 
  1. WINDOW w AS (
 
  1. PARTITION BY CAST(tstamp AS DATE)
 
  1. ORDER BY tstamp
 
  1. ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 
  1. )
 
  1. ORDER BY CAST(tstamp AS DATE)
 

请注意,如何使用窗口规范来指定窗口名称,就像定义公共表达式一样(WITH条款):

 
  1. WINDOW
 
  1. <window-name> AS (<window-specification>)
 
  1. { ,<window-name> AS (<window-specification>)... }
 

我不仅可以重用整个规范,还可以根据部分规范构建规范,并且只重用部分规范。我以前的查询可以这样重写:

 
  1. SELECT DISTINCT
 
  1. first_value(volume) OVER w3 AS first,
 
  1. last_value(volume) OVER w3 AS last,
 
  1. first_value(volume) OVER w3
 
  1. - last_value(volume) OVER w3 AS diff,
 
  1. CAST(tstamp AS DATE) AS date
 
  1. FROM t
 
  1. WINDOW
 
  1. w1 AS (PARTITION BY CAST(tstamp AS DATE)),
 
  1. w2 AS (w1 ORDER BY tstamp),
 
  1. w3 AS (w2 ROWS BETWEEN UNBOUNDED PRECEDING
 
  1. AND UNBOUNDED FOLLOWING)
 
  1. ORDER BY CAST(tstamp AS DATE)
 

每个窗口规范可以从头创建,也可以基于先前定义的窗口规范。注在引用窗口定义时也是如此。如果我想重用PARTITION BY条款和ORDER BY子句,但请更改FRAME条款(ROWS ...),那么我就可以这样写了:

 
  1. SELECT DISTINCT
 
  1. first_value(volume) OVER (
 
  1. w2 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
 
  1. ) AS first,
 
  1. last_value(volume) OVER (
 
  1. w2 ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
 
  1. ) AS last,
 
  1. first_value(volume) OVER (
 
  1. w2 ROWS UNBOUNDED PRECEDING
 
  1. ) - last_value(volume) OVER (
 
  1. w2 ROWS BETWEEN 1 PRECEDING AND UNBOUNDED FOLLOWING
 
  1. ) AS diff,
 
  1. CAST(tstamp AS DATE) AS date
 
  1. FROM t
 
  1. WINDOW
 
  1. w1 AS (PARTITION BY CAST(tstamp AS DATE)),
 
  1. w2 AS (w1 ORDER BY tstamp)
 
  1. ORDER BY CAST(tstamp AS DATE)
 友情链接:直通硅谷  点职佳  北美留学生论坛

本站QQ群:前端 618073944 | Java 606181507 | Python 626812652 | C/C++ 612253063 | 微信 634508462 | 苹果 692586424 | C#/.net 182808419 | PHP 305140648 | 运维 608723728

W3xue 的所有内容仅供测试,对任何法律问题及风险不承担任何责任。通过使用本站内容随之而来的风险与本站无关。
关于我们  |  意见建议  |  捐助我们  |  报错有奖  |  广告合作、友情链接(目前9元/月)请联系QQ:27243702 沸活量
皖ICP备17017327号-2 皖公网安备34020702000426号