[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144184#comment-16144184 ] Jon Haddad commented on CASSANDRA-8576: --- Unfortunately this patch is pretty stale now as 2.x is no longer getting feature improvements. Is there anything here that would be relevant for 4.0? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement >Reporter: Russell Spitzer >Assignee: Alex Liu > Fix For: 2.2.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v1-2.2-branch.txt, CASSANDRA-8576-v2-2.1-branch.txt, > CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605809#comment-14605809 ] Aleksey Yeschenko commented on CASSANDRA-8576: -- bq. (Edit: Piotr +1'd v3 already) Doh, you are right. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.2.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605803#comment-14605803 ] Aleksey Yeschenko commented on CASSANDRA-8576: -- Piotr's approval and +1 as the reviewer. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605771#comment-14605771 ] Jeremy Hanna commented on CASSANDRA-8576: - Is there anything else that needs to happen on this before committing? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569713#comment-14569713 ] Philip Thompson commented on CASSANDRA-8576: This does not break any of the existing Pig tests. I ran some additional tests and found no major issues. As for a mixed-version cluster, I spun up a 3-node cluster of C*, with two nodes running this patch and the third without. I connected Pig to the cluster, using the unmodified node as the initial address. I then ran some MapReduce jobs to select data from the cluster. The jobs succeeded, and I did not see any errors in the log. +1 from me. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564440#comment-14564440 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- Yes, it would be good to test it in a mixed version cluster. If cassandra.jar is part of the Hadoop job classpath, then there shouldn't be any problems. Problems might happen if cassandra.jar is on the classpath of Hadoop TT (inherited by all jobs), and different TTs used mixed versions of it (with / without this patch). > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561632#comment-14561632 ] Philip Thompson commented on CASSANDRA-8576: Reading [~jjordan]'s comment, does this need a test of a hadoop job while in a mixed version cluster? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539417#comment-14539417 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- [~jjordan] yes, you're right. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538280#comment-14538280 ] Jonathan Ellis commented on CASSANDRA-8576: --- So the ball is [~philipthompson]'s now for test? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt, CASSANDRA-8576-v3-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538044#comment-14538044 ] Alex Liu commented on CASSANDRA-8576: - It's not much different, but I will use your changes :) > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536484#comment-14536484 ] Jeremiah Jordan commented on CASSANDRA-8576:
bq. It looks better now, but the mixed-cluster during rolling upgrade issue is still there. If someone upgrades half of the cluster to the version with this patch, Hadoop jobs will very likely report errors (not sure how bad that will be - need to test it).
This is only an issue if the jobs are pulling the C* jar off of the nodes and the jar isn't part of the job itself? So if this is a problem for someone, they have a workaround.
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536352#comment-14536352 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- It looks better now, but the mixed-cluster during rolling upgrade issue is still there. If someone upgrades half of the cluster to the version with this patch, Hadoop jobs will very likely report errors (not sure how bad that will be - need to test it). If this is not a problem, +1. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536341#comment-14536341 ] Piotr Kołaczkowski commented on CASSANDRA-8576: ---
Some comments were not addressed.
{noformat}
boolean containToken;
for (Range subrange : ranges)
{
    //make sure subrange contains the token
    containToken = false;
    if (token != null)
    {
        if (subrange.contains(token))
            containToken = true;
        else
            continue;
    }
    ColumnFamilySplit split = new ColumnFamilySplit(
        factory.toString(subrange.left),
        factory.toString(subrange.right),
        subSplit.getRow_count(),
        endpoints);
    if (containToken)
        split.setPartitionKeyEqQuery(containToken);
    logger.debug("adding {}", split);
{noformat}
Multiple code smells in this fragment:
* boolean flag declared in a needlessly broad scope. If something is used only inside a loop, it should be declared only inside the loop.
* continue controlled by a boolean flag
* redundant if (the code is equivalent without if (containToken))
I simplified it for you:
{noformat}
for (Range subrange : ranges)
{
    boolean containsToken = token != null && subrange.contains(token);
    if (token == null || containsToken)
    {
        ColumnFamilySplit split = new ColumnFamilySplit(
            factory.toString(subrange.left),
            factory.toString(subrange.right),
            subSplit.getRow_count(),
            endpoints);
        split.setPartitionKeyEqQuery(containsToken);
        logger.debug("adding {}", split);
        splits.add(split);
    }
}
{noformat}
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, > CASSANDRA-8576-v2-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services.
> Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
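The simplified loop above can be tried in isolation. The following is only a minimal sketch under stated assumptions: `TokenRange` and `Split` are hypothetical stand-ins invented for illustration (they are not Cassandra's `Range` or `ColumnFamilySplit`), tokens are plain `long` values, and range wrap-around is ignored. It shows the intended behaviour: with no token every subrange becomes a split, and with a token only the subrange containing it is kept and flagged as a partition key equality query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; not Cassandra classes.
final class SplitFilterSketch
{
    /** Token range (left, right], without wrap-around handling. */
    static final class TokenRange
    {
        final long left;
        final long right;

        TokenRange(long left, long right)
        {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token)
        {
            return token > left && token <= right;
        }
    }

    /** A split plus the flag marking it as a partition key equality query. */
    static final class Split
    {
        final TokenRange range;
        final boolean partitionKeyEqQuery;

        Split(TokenRange range, boolean partitionKeyEqQuery)
        {
            this.range = range;
            this.partitionKeyEqQuery = partitionKeyEqQuery;
        }
    }

    /**
     * A null token means no partition key predicate was supplied, so every
     * subrange is kept. Otherwise only subranges containing the token are
     * kept, and each of them is flagged.
     */
    static List<Split> buildSplits(List<TokenRange> ranges, Long token)
    {
        List<Split> splits = new ArrayList<>();
        for (TokenRange subrange : ranges)
        {
            // Short-circuit keeps the unboxing of token safe when it is null.
            boolean containsToken = token != null && subrange.contains(token);
            if (token == null || containsToken)
                splits.add(new Split(subrange, containsToken));
        }
        return splits;
    }

    public static void main(String[] args)
    {
        List<TokenRange> ranges = Arrays.asList(
            new TokenRange(0, 100), new TokenRange(100, 200), new TokenRange(200, 300));
        System.out.println(buildSplits(ranges, null).size()); // prints 3
        System.out.println(buildSplits(ranges, 150L).size()); // prints 1
    }
}
```

Setting the flag unconditionally from `containsToken` yields the same result as the original `if (containToken)` guard, because the flag is only ever true when the token falls inside the subrange.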
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520395#comment-14520395 ] Alex Liu commented on CASSANDRA-8576: - if token == null, containToken is false. All other comments will be addressed in the new patch > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520112#comment-14520112 ] Jonathan Ellis commented on CASSANDRA-8576: --- 2.1.x means the next 2.1 release. (2.1.5 is already released.) > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520084#comment-14520084 ] Alex Liu commented on CASSANDRA-8576: - Which branch should this go into? Is it still going into 2.1.5, or into another release? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519685#comment-14519685 ] Jonathan Ellis commented on CASSANDRA-8576: --- Alex, is this still on your radar? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.x > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496256#comment-14496256 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- I finished the review. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496197#comment-14496197 ] Piotr Kołaczkowski commented on CASSANDRA-8576: ---
CqlTableTest L336 and L368
{noformat}
count = 0;
while (it.hasNext())
{
    it.next();
    count ++;
}
{noformat}
Use Guava Iterators.size(it).
---
Code style issues:
getToken, retrieveKeys: unused exceptions reported
getToken: too big and too nested for my taste
---
retrieveKeys L492:
{noformat}
CqlRow cqlRow = result.rows.get(0);
{noformat}
Will fail in a very cryptic way if the keyspace / table doesn't exist. It is good to give the user hints about what went wrong.
---
retrieveKeys L503:
{noformat}
for (CfDef cfDef : ksDef.cf_defs)
{
    if (cfDef.name.equalsIgnoreCase(cfName))
    {
        CFMetaData cfMeta = ThriftConversion.fromThrift(cfDef);
{noformat}
Why equalsIgnoreCase?
---
retrieveKeys L512:
{noformat}
return Pair.create(parseType(ByteBufferUtil.string(ByteBuffer.wrap(cqlRow.columns.get(1).getValue()))), keys);
{noformat}
Code style: Expression too complex, too many nesting levels, hard to read.
---
getToken L410:
{noformat}
int i = 0;
{noformat}
This should be declared in the first branch of the following if, because it is used only there, in order not to pollute the wider scope.
---
{noformat}
catch (Exception e)
{
    //not a Terminal term
}
{noformat}
Are you sure you really want to swallow all the exceptions here? Or did you have some specific exception in mind like {{InvalidRequestException}}? Swallowing exceptions by a very general catch-all clause is very dangerous.
---
getToken L456-L462:
{noformat}
for (String key : validators.keySet())
    keyValues[i++] = eqColumns.get(key);
IPartitioner partitioner = ConfigHelper.getInputPartitioner(conf);
if (keyValidator instanceof CompositeType)
    return partitioner.getToken(((CompositeType) keyValidator).build(keyValues));
else
    return partitioner.getToken(eqColumns.get(keys.get(0)));
{noformat}
validators is a HashMap, and HashMaps do not preserve key order. The order of items in the keyValues array here may not match the order of the key columns in the keyValidator, so the values may be misplaced. If all key components are of the same type, this may fail in a very subtle / silent way.
Besides that, the Cassandra style of writing this would be to use a ternary operator:
{noformat}
return (keyValidator instanceof CompositeType) ? ... : ...
{noformat}
---
getSplits L140-L147
{noformat}
try
{
    token = getToken(conf);
}
catch (Exception e)
{
    throw new IOException(e);
}
{noformat}
Given that this change is going to be included in a patch version of Cassandra, we should not increase the likelihood of failure here by throwing additional exceptions that previously could never happen. If getting a token fails, we should log the failure with the exception at ERROR level and continue without the token, because all this token thing is only an optimization.
---
ColumnFamilySplit L74: getPartitionKeyQuery should be called isPartitionKeyQuery
---
SplitCallable#call L293:
{noformat}
if (containToken)
    split.setPartitionKeyEqQuery(containToken);
{noformat}
Can be simplified to:
{noformat}
split.setPartitionKeyEqQuery(true);
{noformat}
containToken is always true at the point of reaching the if statement. Therefore you really don't need the containToken variable at all, and you can remove some earlier code related to setting it as well.
==
Overall I vote against putting this into 2.1.5, because it is too big a feature and may affect correctness and performance.
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of cust
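The key-ordering concern raised in the review of getToken L456-L462 can be illustrated on its own. This is a minimal sketch, not the patch's code: `orderedKeyValues`, the `String` value type, and the column names are hypothetical, and returning null when a key column has no equality predicate is an assumption made for the example. The point is only that values are collected by walking the declared list of partition key columns, so the result does not depend on a HashMap's iteration order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the patch.
final class KeyOrderSketch
{
    /**
     * Collects the equality-predicate values in the declared order of the
     * partition key columns. Returns null (an assumption of this sketch) when
     * some key column has no equality predicate, since a single-partition
     * token cannot be computed in that case.
     */
    static List<String> orderedKeyValues(List<String> keyColumns, Map<String, String> eqColumns)
    {
        List<String> values = new ArrayList<>(keyColumns.size());
        for (String column : keyColumns)
        {
            String value = eqColumns.get(column);
            if (value == null)
                return null;
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args)
    {
        // Declared partition key order: (customer_id, region).
        List<String> keyColumns = Arrays.asList("customer_id", "region");

        // Predicates arrive in arbitrary order; a HashMap gives no ordering guarantee.
        Map<String, String> eqColumns = new HashMap<>();
        eqColumns.put("region", "eu");
        eqColumns.put("customer_id", "42");

        System.out.println(orderedKeyValues(keyColumns, eqColumns)); // prints [42, eu]
    }
}
```

Driving the loop from the ordered key list rather than from `keySet()` keeps each value aligned with its component of the composite key, which matters most when all components share a type and a misplacement would not raise an error.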
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496182#comment-14496182 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- I know it has been this way from the beginning, but that is no reason not to change it for the better :) > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493123#comment-14493123 ] Alex Liu commented on CASSANDRA-8576: - Someone from Product Management should be able to answer it. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493122#comment-14493122 ] Alex Liu commented on CASSANDRA-8576: - Someone from Product Management should be able to answer it. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493117#comment-14493117 ] Alex Liu commented on CASSANDRA-8576: - It's been this way from the very beginning. Internally, URL decoding is used. I don't think there is an easy way around it. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492934#comment-14492934 ] Piotr Kołaczkowski commented on CASSANDRA-8576: ---
{noformat}
pig.registerQuery("composite_rows = LOAD 'cql://cql3ks/compositekeytable?" + defaultParameters + nativeParameters + "&where_clause=key1%20%3D%20%27key1%27%20and%20key2%20%3D%20111%20and%20column1%3D100&page_size=2' USING CqlNativeStorage();");
{noformat}
Things like this make my eyes cry. I know it was already like this, but why can't we just specify the query in a human-readable form and call a function to URL-encode it?
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
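A minimal sketch of the suggestion in the comment above: keep the where clause readable and encode it with a helper. The class name `WhereClauseEncoder` and its `encode` method are hypothetical and not part of Cassandra or Pig; only `java.net.URLEncoder` is a real API. The sketch assumes the loader expects spaces as `%20`, as in the test string quoted above, and it leaves out the `defaultParameters` and `nativeParameters` values used by the test.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class WhereClauseEncoder {
    // Hypothetical helper: URL-encodes a readable CQL where clause
    // so it can be placed in a cql:// location string.
    public static String encode(String whereClause) {
        try {
            // URLEncoder produces form encoding, where a space becomes '+'.
            // A literal '+' in the input is escaped as %2B, so every '+'
            // left in the result stands for a space and can become %20.
            return URLEncoder.encode(whereClause, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String where = "key1 = 'key1' and key2 = 111 and column1=100";
        String location = "cql://cql3ks/compositekeytable?where_clause="
                + encode(where) + "&page_size=2";
        System.out.println(location);
    }
}
```

With a helper like this, the test could state the predicate once in plain CQL and leave the escaping to the code.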
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492929#comment-14492929 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- The whole {{AbstractColumnFamilyInputFormat#getToken}} thing - this is quite a complex piece of logic, and it is always invoked. I'm not sure we really want to merge it into 2.1.5. I'm afraid this may destabilize things. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492720#comment-14492720 ] Piotr Kołaczkowski commented on CASSANDRA-8576: --- AbstractColumnFamilyInputFormat#getToken:
{noformat}
if (keyValidator instanceof CompositeType)
    return partitioner.getToken(((CompositeType) keyValidator).build(keyValues)); /// <<< should be CompositeType.build, because this is a static method
else
    return partitioner.getToken(eqColumns.get(keys.get(0)));
{noformat}
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
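To illustrate the review remark above, here is a small self-contained sketch of why calling a static method through an instance expression is misleading in Java. `KeyCodec` is a hypothetical stand-in and not the real `CompositeType`; the sketch shows only the language behavior, not Cassandra's code.

```java
public class StaticCallDemo {
    // Hypothetical stand-in for a class that exposes a static build method.
    static class KeyCodec {
        static String build(String... parts) {
            return String.join(":", parts);
        }
    }

    // Mirrors the reviewed pattern: cast an object, then call a static method on it.
    // The cast expression is evaluated and then discarded; the call is bound to
    // KeyCodec.build at compile time, so the instance plays no role.
    static String viaInstance(Object validator, String... parts) {
        return ((KeyCodec) validator).build(parts);
    }

    // The form the reviewer asks for: the call site says what actually happens.
    static String viaClass(String... parts) {
        return KeyCodec.build(parts);
    }

    public static void main(String[] args) {
        System.out.println(viaInstance(new KeyCodec(), "key1", "111"));
        // Even a null reference works, because no instance dispatch takes place.
        System.out.println(viaInstance(null, "key1", "111"));
        System.out.println(viaClass("key1", "111"));
    }
}
```

Both forms return the same value, so the change is about readability rather than behavior.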
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492709#comment-14492709 ] Piotr Kołaczkowski commented on CASSANDRA-8576: ---
{noformat}
@@ -79,6 +90,7 @@ public class ColumnFamilySplit extends InputSplit implements Writable, org.apach
 {
     out.writeUTF(startToken);
     out.writeUTF(endToken);
+    out.writeBoolean(partitionKeyEqQuery);
     out.writeInt(dataNodes.length);
{noformat}
This is going to break mixed-version clusters. Hadoop tasks will error out in weird ways on a cluster with some nodes on 2.1.4 and some on 2.1.5. It is actually very unfortunate that split serialization doesn't write a length or version header first, so we could detect it properly on the clients. Are you sure we want to merge this feature in the middle of 2.1.x? Are we
> Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
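A rough sketch of the version header the comment above wishes the split serialization had. This is not the actual `ColumnFamilySplit` code: the class `VersionedSplit`, the version numbers, and the use of byte arrays instead of Hadoop's `Writable` interface are all assumptions made to keep the example self-contained. It would not help splits written by existing releases, which carry no header at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VersionedSplit {
    // Hypothetical numbering: version 1 has no flag, version 2 adds partitionKeyEqQuery.
    static final int CURRENT_VERSION = 2;

    public final String startToken;
    public final String endToken;
    public final boolean partitionKeyEqQuery;
    public final String[] dataNodes;

    public VersionedSplit(String startToken, String endToken,
                          boolean partitionKeyEqQuery, String[] dataNodes) {
        this.startToken = startToken;
        this.endToken = endToken;
        this.partitionKeyEqQuery = partitionKeyEqQuery;
        this.dataNodes = dataNodes;
    }

    public byte[] serialize() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(CURRENT_VERSION); // header first, so readers can branch on it
            out.writeUTF(startToken);
            out.writeUTF(endToken);
            out.writeBoolean(partitionKeyEqQuery);
            out.writeInt(dataNodes.length);
            for (String node : dataNodes)
                out.writeUTF(node);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static VersionedSplit deserialize(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int version = in.readInt();
            // Fail with a clear message instead of misreading the following fields.
            if (version < 1 || version > CURRENT_VERSION)
                throw new IllegalStateException("Unsupported split version: " + version);
            String start = in.readUTF();
            String end = in.readUTF();
            // Only version 2 and later carry the flag; older data defaults to false.
            boolean eqQuery = false;
            if (version >= 2)
                eqQuery = in.readBoolean();
            String[] nodes = new String[in.readInt()];
            for (int i = 0; i < nodes.length; i++)
                nodes[i] = in.readUTF();
            return new VersionedSplit(start, end, eqQuery, nodes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a header in place, a reader that meets a newer split can reject it explicitly, and a newer reader can still accept older splits by skipping the fields they lack.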
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484093#comment-14484093 ] Aleksey Yeschenko commented on CASSANDRA-8576: -- CASSANDRA-8358 is taking a bit longer than I expected to review/commit. Could be delayed by a week or so more. Can you guys go ahead and review/commit this without 8358? I'll rebase CASSANDRA-8358 afterwards. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391765#comment-14391765 ] Alex Liu commented on CASSANDRA-8576: - pending on CASSANDRA-8358 > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.5 > > Attachments: 8576-2.1-branch.txt, 8576-trunk.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349527#comment-14349527 ] Alex Liu commented on CASSANDRA-8576: - Pig-test on trunk fails; Philip Thompson is fixing it. I attached the patch for trunk, but we need to merge it with Philip Thompson's fix. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.4 > > Attachments: 8576-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339343#comment-14339343 ] Brandon Williams commented on CASSANDRA-8576: - LGTM, can you attach a version for trunk as well? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.4 > > Attachments: 8576-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329142#comment-14329142 ] Alex Liu commented on CASSANDRA-8576: - [~brandon.williams] Do you have time to review this ticket? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer >Assignee: Alex Liu > Fix For: 2.1.4 > > Attachments: 8576-2.1-branch.txt > > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276183#comment-14276183 ] Russell Alexander Spitzer commented on CASSANDRA-8576: -- For this particular use-case they only need EQ, but IN would be nice as well. > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8576) Primary Key Pushdown For Hadoop
[ https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276140#comment-14276140 ] Alex Liu commented on CASSANDRA-8576: - Should it work only for EQ predicates? Should it also include IN predicates? > Primary Key Pushdown For Hadoop > --- > > Key: CASSANDRA-8576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8576 > Project: Cassandra > Issue Type: Improvement > Components: Hadoop >Reporter: Russell Alexander Spitzer > > I've heard reports from several users that they would like to have predicate > pushdown functionality for hadoop (Hive in particular) based services. > Example usecase > Table with wide partitions, one per customer > Application team has HQL they would like to run on a single customer > Currently time to complete scales with number of customers since Input Format > can't pushdown primary key predicate > Current implementation requires a full table scan (since it can't recognize > that a single partition was specified) -- This message was sent by Atlassian JIRA (v6.3.4#6332)