[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940330#comment-14940330
 ] 

Maryann Xue commented on PHOENIX-2270:
--

[~jnadeau] I implemented this in https://github.com/jacques-n/drill/pull/4, 
but I am not sure whether the sort on the Drill side is a merge rather than a 
real sort. Could you please verify?
The query plan (also printed by the test case) is:
{code}
00-00    Screen
00-01      Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$1], dir0=[DESC])
00-04            Project(B=[$0], E1=[$1], E2=[$2], R=[$3])
                   PhoenixServerSort(sort0=[$1], dir0=[DESC])
                     PhoenixTableScan(table=[[PHOENIX, A, BEER]], filter=[>=($1, 1)])
00-05              Phoenix
{
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "ExplainHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ {
      "kind" : "LONG",
      "type" : "SESSION",
      "name" : "planner.width.max_per_node",
      "num_val" : 2
    } ],
    "queue" : 0,
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "jdbc-scan",
    "@id" : 0,
    "scans" : [
"CgMKATASMQoNc2NhblByb2plY3RvchIgAP8EBAQBGAIBAhkCATACRTECGQIBMAJFMgIZAgEwAVISJQoKY29sdW1uSW5mbxIXBAFCAkUxAkUyAVISFwoSX05vbkFnZ3JlZ2F0ZVF1ZXJ5EgEBEiUKBV9Ub3BOEhyMAUAAAP8AAQAARRkCAP8EBAEABSQyLkUxKvABCilvcmcuYXBhY2hlLmhhZG9vcC5oYmFzZS5maWx0ZXIuRmlsdGVyTGlzdBLCAQgBElMKPG9yZy5hcGFjaGUucGhvZW5peC5maWx0ZXIuU2luZ2xlQ0ZDUUtleVZhbHVlQ29tcGFyaXNvbkZpbHRlchITFwQCAhkCATACRTEDBYEGAxJpCjBvcmcuYXBhY2hlLnBob2VuaXguZmlsdGVyLkNvbHVtblByb2plY3Rpb25GaWx0ZXISNQAAABUfiwgAMwAAId/b9AEBFR+LCAAzAAAh39v0AQAAMgwIABD//384AUAB"
    ],
    "config" : {
      "type" : "phoenix",
      "url" : "jdbc:phoenix:localhost",
      "enabled" : true
    },
    "table" : "A.BEER",
    "userName" : "",
    "cost" : 0.0
  }, {
    "pop" : "project",
    "@id" : 4,
    "exprs" : [ {
      "ref" : "`B`",
      "expr" : "`B`"
    }, {
      "ref" : "`E1`",
      "expr" : "`E1`"
    }, {
      "ref" : "`E2`",
      "expr" : "`E2`"
    }, {
      "ref" : "`R`",
      "expr" : "`R`"
    } ],
    "child" : 0,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : 50.0
  }, {
    "pop" : "external-sort",
    "@id" : 3,
    "child" : 4,
    "orderings" : [ {
      "expr" : "`E1`",
      "order" : "DESC",
      "nullDirection" : "UNSPECIFIED"
    } ],
    "reverse" : false,
    "initialAllocation" : 2000,
    "maxAllocation" : 100,
    "cost" : 50.0
  }, {
    "pop" : "selection-vector-remover",
    "@id" : 2,
    "child" : 3,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : 50.0
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`B`",
      "expr" : "`B`"
    }, {
      "ref" : "`E1`",
      "expr" : "`E1`"
    }, {
      "ref" : "`E2`",
      "expr" : "`E2`"
    }, {
      "ref" : "`R`",
      "expr" : "`R`"
    } ],
    "child" : 2,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : 50.0
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : 50.0
  } ]
}
{code} 
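
To make the merge-versus-sort question concrete, here is a minimal Java sketch 
of the two strategies. It is illustrative only: the class and method names 
(MergeSketch, mergeDescending, fullSortDescending) are hypothetical and are 
not taken from Drill or Phoenix, and it assumes each input stream has already 
been sorted in descending order, for example by a server-side sort. The 
physical plan above names its operator "external-sort"; the sketch does not 
settle what that operator does, it only shows the distinction being asked 
about.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: not Drill's or Phoenix's operator code. */
final class MergeSketch {

    /** The current head row of one input stream plus the rest of that stream. */
    private static final class Cursor {
        final int head;
        final Iterator<Integer> rest;

        Cursor(int head, Iterator<Integer> rest) {
            this.head = head;
            this.rest = rest;
        }
    }

    /**
     * Merges inputs that are each ALREADY sorted in descending order.
     * Roughly O(n log k) comparisons for n rows in k inputs; the heap
     * holds at most k rows at a time.
     */
    static List<Integer> mergeDescending(List<List<Integer>> sortedInputs) {
        // Max-heap on the head value, so the largest remaining row comes out first.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                Comparator.comparingInt((Cursor c) -> c.head).reversed());
        for (List<Integer> input : sortedInputs) {
            Iterator<Integer> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it.next(), it));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head);
            if (c.rest.hasNext()) {
                heap.add(new Cursor(c.rest.next(), c.rest));
            }
        }
        return out;
    }

    /** Full sort: ignores any existing order and re-sorts every row. */
    static List<Integer> fullSortDescending(List<List<Integer>> inputs) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> input : inputs) {
            all.addAll(input);
        }
        all.sort(Comparator.reverseOrder());
        return all;
    }
}
```

Both methods return the same rows in the same order when the inputs are 
pre-sorted; the difference is only in how much work and buffering is needed, 
which is why it matters which of the two the Drill-side Sort actually performs.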

> Implement Drill-specific rule for first level server-side sort
> --
>
>                 Key: PHOENIX-2270
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2270
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>              Labels: calcite, drill
>
> Phoenix should have a physical operator that executes a sort on the 
> server-side which Drill can leverage when re-ordering is necessary. Unlike 
> PHOENIX-2269 which is clearly going to be more efficient to let Phoenix 
> handle, the sort is more of a gray area. Phoenix will be faster in the way it 
> does the scan within the coprocessor, but it still needs to return the same 
> number of rows. This process puts a pretty heavy burden on the region server 
> as well. We should measure performance with and without Phoenix doing the 
> sort. One potential scenario that may be a win for Phoenix is if the rows are 
> already partially sorted and Phoenix can take advantage of this (which is not 
> currently the case).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-2270) Implement Drill-specific rule for first level server-side sort

2015-10-01 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940626#comment-14940626
 ] 

Maryann Xue commented on PHOENIX-2270:
--

[~jamestaylor] I like your idea of measuring the performance of this. We could 
then find out which cases benefit from a Phoenix partial sort and which do 
not, and model the rels' cost accordingly.
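
As a rough illustration of what such cost modelling could look like, here is 
a small Java sketch. Everything in it is an assumption made for illustration: 
the class SortCostSketch, its methods, and the serverCostPerRow parameter are 
hypothetical and are not part of Phoenix, Drill, or Calcite. It uses the 
textbook comparison estimates of n * log2(n) for a full sort and n * log2(k) 
for merging k sorted streams; the per-row server charge would have to come 
from the measurements discussed above.

```java
/** Hypothetical cost arithmetic; not taken from Phoenix, Drill, or Calcite. */
final class SortCostSketch {

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Comparison estimate for sorting n unordered rows: n * log2(n). */
    static double fullSortCost(double rows) {
        return rows <= 1 ? 0.0 : rows * log2(rows);
    }

    /** Comparison estimate for merging n rows from k sorted streams: n * log2(k). */
    static double mergeCost(double rows, int streams) {
        return streams <= 1 ? 0.0 : rows * log2(streams);
    }

    /**
     * Estimated cost of a plan in which the server pre-sorts each stream and
     * the client only merges. serverCostPerRow is an ASSUMED charge for the
     * extra region-server work and would need to be measured.
     */
    static double serverSortPlanCost(double rows, int streams, double serverCostPerRow) {
        return mergeCost(rows, streams) + rows * serverCostPerRow;
    }
}
```

Under this sketch a planner would prefer the server-side sort only when 
serverSortPlanCost comes out below fullSortCost for the expected row count and 
number of streams.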



