`

Oracle Dimension (Oracle 维)

阅读更多
转自:http://blog.csdn.net/wzy0623/archive/2009/11/13/4805860.aspx

  在数据仓库环境中,我们通常利用物化视图强大的查询重写功能来提升统计查询的性能,但是物化视图的查询重写功能有时候无法智能地判断查询中一些相关联的条件,以至于影响性能。比如我们有一张销售表sales,用于存储订单的详细信息,包含交易日期、顾客编号和销售量。我们创建一张物化视图,按月存储累计销量信息,假如这时候我们要查询按季度或者按年度统计销量信息,Oracle是否能够智能地转换查询重写呢?我们知道交易日期中的日期意味着月,月意味着所处的季度,季度意味着年度,但是Oracle却是无法智能地判断这其中的关系,因此无法利用物化视图查询重写来返回我们季度或年度的销量信息,而是直接查询基表,导致性能产生问题。

  这时候Dimension就派上用场了。Dimension用于说明列之间的父子对应关系,以使优化器能够自动转换不同列的关系,利用物化视图的查询功能来提升查询统计性能。下面我们首先创建一张销售交易表sales,包含交易日期、顾客编号和销售量这几个列,用于保存销售订单信息,整个表有42万多条记录;创建另一张表time_hierarchy用于存储交易日期中时间的关系,包含交易日期及其对应的月、季度及年度等信息,然后我们将体验Dimension的强大功能。

--建表
create table sales
(trans_date date, cust_id int, sales_amount number );

Table created.

--插入数据
insert /*+ APPEND */ into sales
select trunc(sysdate,'year')+mod(rownum,366) TRANS_DATE,
mod(rownum,100) CUST_ID,
abs(dbms_random.random)/100 SALES_AMOUNT
from all_objects;

5926 rows created.

commit;

Commit complete.

begin
for i in 1 .. 6
loop
insert /*+ APPEND */ into sales
select trans_date, cust_id, abs(dbms_random.random)/100 SALES_AMOUNT
from sales;
commit;
end loop;
end;
/

PL/SQL procedure successfully completed.

select count(*) from sales;

COUNT(*)
----------
426672


创建索引组织表time_hierarchy,里面生成了交易日期中日期DAY、月MMYYYY、季度QTY_YYYY、年度YYYY的关系。

create table time_hierarchy
(day primary key, mmyyyy, mon_yyyy, qtr_yyyy, yyyy)
organization index
as
select distinct
trans_date DAY,
cast (to_char(trans_date,'mmyyyy') as number) MMYYYY,
to_char(trans_date,'mon-yyyy') MON_YYYY,
'Q' || ceil( to_char(trans_date,'mm')/3) || ' FY'
|| to_char(trans_date,'yyyy') QTR_YYYY,
cast( to_char( trans_date, 'yyyy' ) as number ) YYYY
from sales
/

Table created.

  接下我们创建一张物化视图mv_sales,用于存储每个客户对应每个月的销量统计信息。

create materialized view mv_sales
build immediate
refresh on demand
enable query rewrite
as
select sales.cust_id, sum(sales.sales_amount) sales_amount,
time_hierarchy.mmyyyy
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by sales.cust_id, time_hierarchy.mmyyyy
/

Materialized view created.

我们对基表进行分析,以使优化器能够物化视图的查询重写功能:


analyze table sales compute statistics;

Table analyzed.

analyze table time_hierarchy compute statistics;

Table analyzed.

设置会话的查询重写功能:

alter session set query_rewrite_enabled=true;

Session altered.

alter session set query_rewrite_integrity=trusted;

Session altered.

接下来我们按月统计总的销量:


select time_hierarchy.mmyyyy, sum(sales_amount)
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by time_hierarchy.mmyyyy
/


MMYYYY SUM(SALES_AMOUNT)
---------- -----------------
12006 4.0574E+11
12007 1.2297E+10
22006 3.6875E+11
32006 3.9507E+11
42006 3.7621E+11
52006 3.8549E+11
62006 3.6641E+11
72006 3.8110E+11
82006 3.8502E+11
92006 3.7278E+11
102006 3.7983E+11
112006 3.7210E+11
122006 3.8364E+11

13 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=327 Bytes=8502)

1 0 SORT (GROUP BY) (Cost=4 Card=327 Bytes=8502)

2 1 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=327 Bytes=8502)

Statistics

----------------------------------------------------------

17 recursive calls

0 db block gets

25 consistent gets

4 physical reads


我们可以看到查询使用了查询重写的功能,直接查询物化视图中的查询方案,而不是查询其表,逻辑IO只有25个,性能相当良好。


假如这时候我们要按季度来查询统计销量信息,结果又会是怎样呢?


select time_hierarchy.qtr_yyyy, sum(sales_amount)

2 from sales, time_hierarchy

3 where sales.trans_date = time_hierarchy.day

4 group by time_hierarchy.qtr_yyyy

5 /


QTR_YYYY SUM(SALES_AMOUNT)

------------------------------------------------ -----------------

Q1 FY2006 1.1696E+12

Q1 FY2007 1.2297E+10

Q2 FY2006 1.1281E+12

Q3 FY2006 1.1389E+12

Q4 FY2006 1.1356E+12


Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1681 Card=5 Bytes=145)

1 0 SORT (GROUP BY) (Cost=1681 Card=5 Bytes=145)

2 1 NESTED LOOPS (Cost=35 Card=426672 Bytes=12373488)

3 2 TABLE ACCESS (FULL) OF 'SALES' (Cost=35 Card=426672

4 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_7828' (UNIQUE)


Statistics

----------------------------------------------------------

14 recursive calls

0 db block gets

428048 consistent gets

599 physical reads


可以看到查询将直接查询基表产生了将近428048个逻辑IO,性能无法满足需求。

接下我们创建一个Dimension表time_hierarchy_dim,用于提醒优化器time_hierarchy表中的DAY列暗示着MMYYYY,MMYYYY又意味着QTY_YYYY,QTY_YYYY又意味着YYYY。然后我们将重新运行上面那个查询,看执行计划发生了怎样的变更。


create dimension time_hierarchy_dim

2 level day is time_hierarchy.day

3 level mmyyyy is time_hierarchy.mmyyyy

4 level qtr_yyyy is time_hierarchy.qtr_yyyy

5 level yyyy is time_hierarchy.yyyy

6 hierarchy time_rollup

7 (

8 day child of

9 mmyyyy child of

10 qtr_yyyy child of

11 yyyy

12 )

13 attribute mmyyyy

14 determines mon_yyyy;


Dimension created.


select time_hierarchy.qtr_yyyy, sum(sales_amount)

2 from sales, time_hierarchy

3 where sales.trans_date = time_hierarchy.day

4 group by time_hierarchy.qtr_yyyy

5 /


QTR_YYYY SUM(SALES_AMOUNT)

------------------------------------------------ -----------------

Q1 FY2006 1.1696E+12

Q1 FY2007 1.2297E+10

Q2 FY2006 1.1281E+12

Q3 FY2006 1.1389E+12

Q4 FY2006 1.1356E+12


Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=14 Card=5 Bytes=195)

1 0 SORT (GROUP BY) (Cost=14 Card=5 Bytes=195)

2 1 HASH JOIN (Cost=7 Card=1157 Bytes=45123)

3 2 VIEW (Cost=4 Card=46 Bytes=598)

4 3 SORT (UNIQUE) (Cost=4 Card=46 Bytes=598)

5 4 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_7828' (UNIQUE)

6 2 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=327


Statistics

----------------------------------------------------------

193 recursive calls

0 db block gets

49 consistent gets

2 physical reads


可以看到创建Dimension后,Oracle已经能够智能地理解交易日期中月度和季度的转换关系,查询使用到物化视图,逻辑IO由原来的428048个减少到49个,性能有了大幅的提升。

同样我们再来统计一下年度的销量信息:


select time_hierarchy.yyyy, sum(sales_amount)

2 from sales, time_hierarchy

3 where sales.trans_date = time_hierarchy.day

4 group by time_hierarchy.yyyy

5 /


YYYY SUM(SALES_AMOUNT)

---------- -----------------

2006 4.5721E+12

2007 1.2297E+10


Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=2 Bytes=66)

1 0 SORT (GROUP BY) (Cost=10 Card=2 Bytes=66)

2 1 HASH JOIN (Cost=7 Card=478 Bytes=15774)



我们再创建一张customer_hierarchy表,用于存储客户代码、邮政编码和地区的关系,然后我们将按不同邮编或地区来查询各自的月度、季度或者年度销量信息。

create table customer_hierarchy

2 ( cust_id primary key, zip_code, region )

3 organization index

4 as

5 select cust_id,

6 mod( rownum, 6 ) || to_char(mod( rownum, 1000 ), 'fm0000') zip_code,

7 mod( rownum, 6 ) region

8 from ( select distinct cust_id from sales)

9 /

Table created.

analyze table customer_hierarchy compute statistics;

Table analyzed.

改写物化视图,查询方案中添加按不同邮编的月度统计销量。

drop materialized view mv_sales;

Materialized view dropped.

create materialized view mv_sales

2 build immediate

3 refresh on demand

4 enable query rewrite

5 as

6 select customer_hierarchy.zip_code,

7 time_hierarchy.mmyyyy,

8 sum(sales.sales_amount) sales_amount

9 from sales, time_hierarchy, customer_hierarchy

10 where sales.trans_date = time_hierarchy.day

11 and sales.cust_id = customer_hierarchy.cust_id

12 group by customer_hierarchy.zip_code, time_hierarchy.mmyyyy

13 /

Materialized view created.

set autotrace traceonly

select customer_hierarchy.zip_code,

2 time_hierarchy.mmyyyy,

3 sum(sales.sales_amount) sales_amount

4 from sales, time_hierarchy, customer_hierarchy

5 where sales.trans_date = time_hierarchy.day

6 and sales.cust_id = customer_hierarchy.cust_id

7 group by customer_hierarchy.zip_code, time_hierarchy.mmyyyy

8 /

1216 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=409 Bytes=20450)

1 0 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=409 Bytes=20450)

Statistics

----------------------------------------------------------

28 recursive calls

0 db block gets

116 consistent gets

5 physical reads

可以看到如果按不同邮编、不同月度来统计查询的话,优化器将会查询物化视图中的查询方案,性能也是比较可观的。假如我们查不同地区年度的统计销量信息,结果又会是怎样?

select customer_hierarchy.region,

2 time_hierarchy.yyyy,

3 sum(sales.sales_amount) sales_amount

4 from sales, time_hierarchy, customer_hierarchy

5 where sales.trans_date = time_hierarchy.day

6 and sales.cust_id = customer_hierarchy.cust_id

7 group by customer_hierarchy.region, time_hierarchy.yyyy

8 /

9 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1681 Card=9 Bytes=261)

1 0 SORT (GROUP BY) (Cost=1681 Card=9 Bytes=261)

2 1 NESTED LOOPS (Cost=35 Card=426672 Bytes=12373488)

3 2 NESTED LOOPS (Cost=35 Card=426672 Bytes=8106768)

4 3 TABLE ACCESS (FULL) OF 'SALES' (Cost=35 Card=426672

5 3 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_7833' (UNIQUE)

6 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_7828' (UNIQUE)

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

428047 consistent gets

745 physical reads

可以看到查询性能大有影响。接下我们同样创建dimension sales_dimension,用于说明客户代码和邮编、地区间的关系:

drop dimension time_hierarchy_dim

2 /

Dimension dropped.

create dimension sales_dimension

2 level cust_id is customer_hierarchy.cust_id

3 level zip_code is customer_hierarchy.zip_code

4 level region is customer_hierarchy.region

5 level day is time_hierarchy.day

6 level mmyyyy is time_hierarchy.mmyyyy

7 level qtr_yyyy is time_hierarchy.qtr_yyyy

8 level yyyy is time_hierarchy.yyyy

9 hierarchy cust_rollup

10 (

11 cust_id child of

12 zip_code child of

13 region

14 )

15 hierarchy time_rollup

16 (

17 day child of

18 mmyyyy child of

19 qtr_yyyy child of

20 yyyy

21 )

22 attribute mmyyyy

23 determines mon_yyyy;

Dimension created.

再回到原来的查询,我们可以看到查询性能有了大幅的提升:

set autotrace on

select customer_hierarchy.region,

2 time_hierarchy.yyyy,

3 sum(sales.sales_amount) sales_amount

4 from sales, time_hierarchy, customer_hierarchy

5 where sales.trans_date = time_hierarchy.day

6 and sales.cust_id = customer_hierarchy.cust_id

7 group by customer_hierarchy.region, time_hierarchy.yyyy

8 /

REGION YYYY SALES_AMOUNT

---------- ---------- ------------

0 2006 7.3144E+11

0 2007 4484956329

1 2006 7.8448E+11

2 2006 7.7257E+11

2 2007 4684418980

3 2006 7.7088E+11

4 2006 7.8004E+11

4 2007 3127953246

5 2006 7.3273E+11

9 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=9 Bytes=576)

1 0 SORT (GROUP BY) (Cost=15 Card=9 Bytes=576)

2 1 HASH JOIN (Cost=10 Card=598 Bytes=38272)

3 2 VIEW (Cost=3 Card=100 Bytes=700)

4 3 SORT (UNIQUE) (Cost=3 Card=100 Bytes=700)

5 4 INDEX (FULL SCAN) OF 'SYS_IOT_TOP_7833' (UNIQUE)

6 2 HASH JOIN (Cost=7 Card=598 Bytes=34086)

7 6 VIEW (Cost=4 Card=19 Bytes=133)

8 7 SORT (UNIQUE) (Cost=4 Card=19 Bytes=133)

9 8 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_7828'

10 6 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=409

Statistics

----------------------------------------------------------

364 recursive calls

0 db block gets

88 consistent gets

0 physical reads

set autot trace

select customer_hierarchy.region,

2 time_hierarchy.qtr_yyyy,

3 sum(sales.sales_amount) sales_amount

4 from sales, time_hierarchy, customer_hierarchy

5 where sales.trans_date = time_hierarchy.day

6 and sales.cust_id = customer_hierarchy.cust_id

7 group by customer_hierarchy.region, time_hierarchy.qtr_yyyy;

27 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=23 Card=22 Bytes=154

1 0 SORT (GROUP BY) (Cost=23 Card=22 Bytes=1540)

2 1 HASH JOIN (Cost=11 Card=1447 Bytes=101290)

3 2 VIEW (Cost=3 Card=100 Bytes=700)

4 3 SORT (UNIQUE) (Cost=3 Card=100 Bytes=700)

5 4 INDEX (FULL SCAN) OF 'SYS_IOT_TOP_7833' (UNIQUE) (

6 2 HASH JOIN (Cost=7 Card=1447 Bytes=91161)

7 6 VIEW (Cost=4 Card=46 Bytes=598)

8 7 SORT (UNIQUE) (Cost=4 Card=46 Bytes=598)

9 8 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_7828' (UN

10 6 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=409 B

Statistics

----------------------------------------------------------

10 recursive calls

0 db block gets

19 consistent gets

0 physical reads

select customer_hierarchy.region,

2 time_hierarchy.mon_yyyy,

3 sum(sales.sales_amount) sales_amount

4 from sales, time_hierarchy, customer_hierarchy

5 where sales.trans_date = time_hierarchy.day

6 and sales.cust_id = customer_hierarchy.cust_id

7 group by customer_hierarchy.region, time_hierarchy.mon_yyyy;

75 rows selected.

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=41 Card=56 Bytes=386

1 0 SORT (GROUP BY) (Cost=41 Card=56 Bytes=3864)

2 1 HASH JOIN (Cost=11 Card=3775 Bytes=260475)

3 2 VIEW (Cost=4 Card=120 Bytes=1440)

4 3 SORT (UNIQUE) (Cost=4 Card=120 Bytes=1440)

5 4 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_7828' (UNIQ

6 2 HASH JOIN (Cost=6 Card=409 Bytes=23313)

7 6 VIEW (Cost=3 Card=100 Bytes=700)

8 7 SORT (UNIQUE) (Cost=3 Card=100 Bytes=700)

9 8 INDEX (FULL SCAN) OF 'SYS_IOT_TOP_7833' (UNIQUE)

10 6 TABLE ACCESS (FULL) OF 'MV_SALES' (Cost=2 Card=409 B

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

14 consistent gets

0 physical reads



参考:Tomates Kyte 《Expert One-on-One Oracle》
分享到:
评论

相关推荐

    图书馆图书管理系统 oracle数据库

    Dimension FrameSize=BorrowBookListFrame.getPreferredSize(); Dimension MainFrameSize=getSize(); Point loc=getLocation(); BorrowBookListFrame.setLocation((MainFrameSize.width-FrameSize.width...

    Oracle8i_9i数据库基础

    §8.4 关于维数(DIMENSION) 199 §8.4.1 CREATE DIMENSION语法 200 §8.4.2 创建维的例子 201 第九章 安全管理 203 §9.1 CREATE USER 命令 203 §9.2 建立用户 204 §9.2.1 外部验证(Authenticated )用户 204 ...

    oracle高级面试50问

    4. 解释data block , extent 和 ... 解答:Fact tables 和dimension tables. fact table 包含大量的主要的信息而 dimension tables 存放对fact table 某些属性描述的信息  10. FACT Table上需要建立何种索引?

    基于Java和Oracle实现的图书馆管理系统设计含全部java源码和SQL文件.rar

    table.setPreferredScrollableViewportSize(new Dimension(400,80)); JScrollPane s=new JScrollPane(table); panel2.add(s); c.add(panel1,BorderLayout.NORTH); c.add(panel2,BorderLayout.SOUTH); } ...

    oracle+面试宝典1000例

    压缩文件格式,里面体谅丰富,应该说很全面 9. 给出在STAR SCHEMA中的两种... 解答:Fact tables 和dimension tables. fact table 包含大量的主要的信息而 dimension tables 存放 对fact table 某些属性描述的信息

    Oracle Hyperion Essbase - Oracle Hyperion Essbase - 培训资料

    Oracle Hyperion Essbase - Oracle Hyperion Essbase - 培训资料 At the end of this lesson, you should be able to: Describe computational, reporting, and analytical capabilities defined in the database ...

    Oracle Essbase 11 Development Cookbook

    Table of Contents Preface Chapter 1: Understanding and Modifying Data Sources Chapter 2: Using Essbase Studio Chapter 3: Building the...http://www.packtpub.com/oracle-essbase-11-development-cookbook/book

    matlab导入excel代码-utl_max_values_from_a_three_dimensional_array:三维数组最后一维的

    matlab导入excel代码utl_max_values_from_a_three_Dimension_array 三维数组最后一维的最大值。 关键字:sas sql join合并大数据分析宏oracle teradata mysql sas社区stackoverflow statistics人工智慧AI Python R ...

    matlab导入excel代码-utl_convex_hull_polygons_encompassing_a_three_dimension

    matlab导入excel代码utl_convex_hull_polygons_encompassing_a_three_Dimension_scatter_plo 包含三维散点图的凸包面多边形。 关键字:sas sql join合并大数据分析宏oracle teradata mysql sas社区stackoverflow ...

    Pentaho_bi_server配置手册

    6.3 dimension、hierarchy和level 15 6.3.1 Degenerate dimensions(降解维度) 15 6.3.2 父子维度 16 6.4 star(星形)和snowflake(雪花)模型 18 6.4.1 Shared dimensions(共享维度) 19 7 Pentaho资源库及存储 ...

    matlab导入excel代码-utl_visualizing_suspicious_bivariate_outliers_with_2_di

    matlab导入excel代码utl_visualizing_suspicious_bivariate_outliers_with_2_Dimension_boxplots 使用二维箱形图可视化可疑的双变量离群值。 关键字:sas sql join合并大数据分析宏oracle teradata mysql sas社区...

    数据库基础

    §8.4 关于维数(DIMENSION) 199 §8.4.1 CREATE DIMENSION语法 200 §8.4.2 创建维的例子 201 第九章 安全管理 203 §9.1 CREATE USER 命令 203 §9.2 建立用户 204 §9.2.1 外部验证(Authenticated )用户 204 ...

    jpivot学习总结.doc

    <jp:mondrianQuery dataSource="" id="query01" jdbcDriver="oracle.jdbc.driver.OracleDriver" jdbcUrl="jdbc:oracle:thin:ngykt/ngyktadmin@172.16.46.241:1521:orcl10" catalogUri="/WEB-INF/queries/feeSchema....

    SogouExplorer.exe

    Level dimension: 0 Level storage version: 0x00000 - Unknown? Level weather: Rain time: 0 (now: false), thunder time: 0 (now: false) Level game mode: Game mode: creative (ID 1). Hardcore: false. ...

    ETL工具Kettle用户手册

    目录 Kettle 3.0 用户手册 ...................................................................................................................... 1 Kettle 3.0 用户手册 .....................................

Global site tag (gtag.js) - Google Analytics