[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718132#comment-14718132
 ] 

Ravi Kishore Valeti commented on PHOENIX-2154:
--

By the way, the patch I uploaded was built on top of the 4.5-HBase-0.98 branch. 
I hope it will not conflict with the master branch; if it does, I can upload a 
patch for master.

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch
>
>
> Once a mapper in the MR index job succeeds, its work should not need to be 
> redone if one of the other mappers fails. The initial population of an index 
> is based on a snapshot in time, so new rows written *after* the index build 
> has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PHOENIX-2212) Embed Tracing Web Application as a service

2015-08-28 Thread Nishani (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishani  reassigned PHOENIX-2212:
-

Assignee: Nishani 

> Embed Tracing Web Application as a service 
> ---
>
> Key: PHOENIX-2212
> URL: https://issues.apache.org/jira/browse/PHOENIX-2212
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
>
> The Tracing Web Application is currently deployed as a WAR file. However, it 
> needs to be embedded in Phoenix as a service rather than shipped as a 
> separate WAR.





[jira] [Updated] (PHOENIX-2212) Embed Tracing Web Application as a service

2015-08-28 Thread Nishani (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishani  updated PHOENIX-2212:
--
Attachment: PHOENIX-2212.patch

Attaching the patch

> Embed Tracing Web Application as a service 
> ---
>
> Key: PHOENIX-2212
> URL: https://issues.apache.org/jira/browse/PHOENIX-2212
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
> Attachments: PHOENIX-2212.patch
>
>
> The Tracing Web Application is currently deployed as a WAR file. However, it 
> needs to be embedded in Phoenix as a service rather than shipped as a 
> separate WAR.





[jira] [Commented] (PHOENIX-2212) Embed Tracing Web Application as a service

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718178#comment-14718178
 ] 

Hadoop QA commented on PHOENIX-2212:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12752951/PHOENIX-2212.patch
  against master branch at commit 2f128ee8bd2273397abb1f2989e83657181b0486.
  ATTACHMENT ID: 12752951

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/143//console

This message is automatically generated.

> Embed Tracing Web Application as a service 
> ---
>
> Key: PHOENIX-2212
> URL: https://issues.apache.org/jira/browse/PHOENIX-2212
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
> Attachments: PHOENIX-2212.patch
>
>
> The Tracing Web Application is currently deployed as a WAR file. However, it 
> needs to be embedded in Phoenix as a service rather than shipped as a 
> separate WAR.





[jira] [Updated] (PHOENIX-2212) Embed Tracing Web Application as a service

2015-08-28 Thread Nishani (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishani  updated PHOENIX-2212:
--
Attachment: full-PHOENIX-2212.patch

The full-PHOENIX-2212.patch also includes the updated pom.xml and the file 
removals.

> Embed Tracing Web Application as a service 
> ---
>
> Key: PHOENIX-2212
> URL: https://issues.apache.org/jira/browse/PHOENIX-2212
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
> Attachments: PHOENIX-2212.patch, full-PHOENIX-2212.patch
>
>
> The Tracing Web Application is currently deployed as a WAR file. However, it 
> needs to be embedded in Phoenix as a service rather than shipped as a 
> separate WAR.





[jira] [Commented] (PHOENIX-2212) Embed Tracing Web Application as a service

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718367#comment-14718367
 ] 

Hadoop QA commented on PHOENIX-2212:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12752977/full-PHOENIX-2212.patch
  against master branch at commit 2f128ee8bd2273397abb1f2989e83657181b0486.
  ATTACHMENT ID: 12752977

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/144//console

This message is automatically generated.

> Embed Tracing Web Application as a service 
> ---
>
> Key: PHOENIX-2212
> URL: https://issues.apache.org/jira/browse/PHOENIX-2212
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
> Attachments: PHOENIX-2212.patch, full-PHOENIX-2212.patch
>
>
> The Tracing Web Application is currently deployed as a WAR file. However, it 
> needs to be embedded in Phoenix as a service rather than shipped as a 
> separate WAR.





[jira] [Assigned] (PHOENIX-2209) Building Local Index Asynchronously via IndexTool fails to populate index table

2015-08-28 Thread Rajeshbabu Chintaguntla (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajeshbabu Chintaguntla reassigned PHOENIX-2209:


Assignee: Rajeshbabu Chintaguntla

> Building Local Index Asynchronously via IndexTool fails to populate index 
> table
> ---
>
> Key: PHOENIX-2209
> URL: https://issues.apache.org/jira/browse/PHOENIX-2209
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.5.0
> Environment: CDH: 5.4.4
> HBase: 1.0.0
> Phoenix: 4.5.0 (https://github.com/SiftScience/phoenix/tree/4.5-HBase-1.0) 
> with hacks for CDH compatibility. 
>Reporter: Keren Gu
>Assignee: Rajeshbabu Chintaguntla
>  Labels: IndexTool, LocalIndex, index
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Used the asynchronous index population tool to create a local index (on one 
> column) on tables with 10 columns and 65M, 250M, 340M, and 1.3B rows 
> respectively.
> Table Schema as follows (with generic column names): 
> {quote}
> CREATE TABLE PH_SOJU_SHORT (
> id INT PRIMARY KEY,
> c2 VARCHAR NULL,
> c3 VARCHAR NULL,
> c4 VARCHAR NULL,
> c5 VARCHAR NULL,
> c6 VARCHAR NULL,
> c7 DOUBLE NULL,
> c8 VARCHAR NULL,
> c9 VARCHAR NULL,
> c10 BIGINT NULL
> )
> {quote}
> Example command used (for 65M row table): 
> {quote}
> 0: jdbc:phoenix:localhost> create local index LC_INDEX_SOJU_EVAL_FN on 
> PH_SOJU_SHORT(C4) async;
> {quote}
> And the MR job was started with the command: 
> {quote}
> $ hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table 
> PH_SOJU_SHORT --index-table LC_INDEX_SOJU_EVAL_FN --output-path 
> LC_INDEX_SOJU_EVAL_FN_HFILE
> {quote}
> The IndexTool MR jobs finished in 18 min, 77 min, 77 min, and 2 hr 34 min 
> respectively, but all index tables were empty. 
> For the table with 65M rows, IndexTool had 12 mappers and reducers. MR 
> Counters show Map input and output records = 65M, Reduce Input and output 
> records = 65M. PhoenixJobCounters input and output records are all 65M. 
> IndexTool Reducer Log tail: 
> {quote}
> ...
> 2015-08-25 00:26:44,687 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
> the last merge-pass, with 32 segments left of total size: 22805636866 bytes
> 2015-08-25 00:26:44,693 INFO [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output 
> Committer Algorithm version is 1
> 2015-08-25 00:26:44,765 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: hadoop.native.lib is 
> deprecated. Instead, use io.native.lib.available
> 2015-08-25 00:26:44,908 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is 
> deprecated. Instead, use mapreduce.job.skiprecords
> 2015-08-25 00:26:45,060 INFO [main] 
> org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:36:43,880 INFO [main] 
> org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2: 
> Writer=hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/_temporary/attempt_1440094483400_5974_r_00_0/0/496b926ad624438fa08626ac213d0f92,
>  wrote=10737418236
> 2015-08-25 00:36:45,967 INFO [main] 
> org.apache.hadoop.hbase.io.hfile.CacheConfig: CacheConfig:disabled
> 2015-08-25 00:38:43,095 INFO [main] org.apache.hadoop.mapred.Task: 
> Task:attempt_1440094483400_5974_r_00_0 is done. And is in the process of 
> committing
> 2015-08-25 00:38:43,123 INFO [main] org.apache.hadoop.mapred.Task: Task 
> attempt_1440094483400_5974_r_00_0 is allowed to commit now
> 2015-08-25 00:38:43,132 INFO [main] 
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of 
> task 'attempt_1440094483400_5974_r_00_0' to 
> hdfs://nameservice/user/ubuntu/LC_INDEX_SOJU_EVAL_FN/_LOCAL_IDX_PH_SOJU_EVAL/_temporary/1/task_1440094483400_5974_r_00
> 2015-08-25 00:38:43,158 INFO [main] org.apache.hadoop.mapred.Task: Task 
> 'attempt_1440094483400_5974_r_00_0' done.
> {quote}





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718495#comment-14718495
 ] 

Ravi Kishore Valeti commented on PHOENIX-2154:
--

[~jamestaylor],
Looks like TableRecordWriter.close(TaskAttemptContext context) will be called 
once per map task. In my earlier testing I only had one mapper, so I did not 
realize that. Just confirmed with [~sukuna...@gmail.com] on this.

[~tdsilva], please do not check-in. I will upload a new version by modifying 
the mapper and moving Index state update to Reducer.
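
The lifecycle issue can be shown with a toy simulation (plain Java, not Phoenix 
or Hadoop code; the counters merely stand in for the real hooks): with N map 
tasks, anything placed in a per-task close() fires N times, while a hook in the 
single reducer fires once, which is what makes it the safe place for a one-time 
index state update.

```java
// Toy simulation of the MR hook lifecycle discussed above. This is NOT
// Phoenix or Hadoop code; the counters stand in for the real callbacks.
public class CloseHookDemo {
    static int mapperCloseCalls = 0;
    static int reducerCleanupCalls = 0;

    static void runJob(int numMappers) {
        mapperCloseCalls = 0;
        reducerCleanupCalls = 0;
        for (int i = 0; i < numMappers; i++) {
            // each map task owns a RecordWriter whose close() is invoked
            // when that task finishes, so this fires once per mapper
            mapperCloseCalls++;
        }
        // a single reducer's cleanup() runs once for the whole job:
        // the right spot for a one-time state transition
        reducerCleanupCalls++;
    }

    public static void main(String[] args) {
        runJob(12);
        System.out.println(mapperCloseCalls + " vs " + reducerCleanupCalls); // 12 vs 1
    }
}
```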

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch
>
>
> Once a mapper in the MR index job succeeds, its work should not need to be 
> redone if one of the other mappers fails. The initial population of an index 
> is based on a snapshot in time, so new rows written *after* the index build 
> has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2200) Can phoenix support mapreduce with secure hbase(kerberos)?

2015-08-28 Thread lihuaqing (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718536#comment-14718536
 ] 

lihuaqing commented on PHOENIX-2200:


I am able to kinit successfully from the machine that is running the job 
driver, and I have put hbase-site.xml on the HADOOP_CLASSPATH. 
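
One thing worth checking in this situation (an aside, not a confirmed fix): 
Phoenix allows the Kerberos principal and keytab to be appended as extra 
components of the JDBC URL, so each task can log in itself rather than relying 
on a kinit ticket cache that exists only on the driver machine. A minimal 
sketch, where the quorum, znode, principal, and keytab path are all 
placeholders:

```java
// Sketch of building a Phoenix JDBC URL that embeds the Kerberos
// principal and keytab. All host names and paths are placeholders.
public class PhoenixKerberosUrl {
    static String buildUrl(String quorum, int port, String znode,
                           String principal, String keytabPath) {
        // URL shape: jdbc:phoenix:<quorum>:<port>:<znode>:<principal>:<keytab>
        return "jdbc:phoenix:" + quorum + ":" + port + ":" + znode
                + ":" + principal + ":" + keytabPath;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("zk1.example.com", 2181, "/hbase",
                "phoenixuser@DATA.SCLOUD", "/etc/security/keytabs/phoenix.keytab"));
    }
}
```

DriverManager.getConnection(url) should then perform the keytab login before 
opening the connection.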

> Can phoenix support mapreduce with secure hbase(kerberos)? 
> ---
>
> Key: PHOENIX-2200
> URL: https://issues.apache.org/jira/browse/PHOENIX-2200
> Project: Phoenix
>  Issue Type: Bug
> Environment: hbase-0.98.12.1-hadoop2, phoenix-4.5.0-HBase-0.98-bin
>Reporter: lihuaqing
>
> I cannot get Phoenix MapReduce to work with Kerberos. My code is as follows:
> final Configuration configuration = HBaseConfiguration.create();
> configuration.set("hbase.security.authentication","kerberos");
> configuration.set("hadoop.security.authentication", "kerberos");
> configuration.set("hbase.master.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set("hbase.regionserver.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set(QueryServices.HBASE_CLIENT_PRINCIPAL,"***");
> configuration.set(QueryServices.HBASE_CLIENT_KEYTAB,"***");
> final Job job = Job.getInstance(configuration, "phoenix-mr-job");
> // We can either specify a selectQuery or ignore it when we would like to 
> retrieve all the columns
> final String selectQuery = "SELECT 
> STOCK_NAME,RECORDING_YEAR,RECORDINGS_QUARTER FROM STOCK ";
> // StockWritable is the DBWritable class that enables us to process the 
> Result of the above query
> PhoenixMapReduceUtil.setInput(job, StockWritable.class, "STOCK",  
> selectQuery);  
> // Set the target Phoenix table and the columns
> PhoenixMapReduceUtil.setOutput(job, "STOCK_STATS", 
> "STOCK_NAME,MAX_RECORDING");
> job.setMapperClass(StockMapper.class);
> job.setReducerClass(StockReducer.class); 
> job.setOutputFormatClass(PhoenixOutputFormat.class);
> job.setMapOutputKeyClass(Text.class);
> job.setMapOutputValueClass(DoubleWritable.class);
> job.setOutputKeyClass(NullWritable.class);
> job.setOutputValueClass(StockWritable.class); 
> TableMapReduceUtil.addDependencyJars(job);
> job.waitForCompletion(true);
> I get the following error stack:
> 2015-08-24 12:12:15,767 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: java.sql.SQLException: 
> ERROR 103 (08004): Unable to establish connection.
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:69)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:512)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.sql.SQLException: ERROR 103 (08004): Unable to establish 
> connection.
>   at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
>   at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:297)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:180)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1901)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
>   at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
>   at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
>   at java.sql.DriverManager.getConnection(DriverManager.java:579)
>   at java.sql.DriverManager.getConnection(DriverManager.java:190)
>   at 
> org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
>   at 
> org.apache.phoenix.mapreduce.util.ConnectionUt

[jira] [Commented] (PHOENIX-2200) Can phoenix support mapreduce with secure hbase(kerberos)?

2015-08-28 Thread lihuaqing (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718539#comment-14718539
 ] 

lihuaqing commented on PHOENIX-2200:


I am able to kinit successfully from the machine that is running the job 
driver, but I still get the error shown in the description. What am I missing? 

> Can phoenix support mapreduce with secure hbase(kerberos)? 
> ---
>
> Key: PHOENIX-2200
> URL: https://issues.apache.org/jira/browse/PHOENIX-2200
> Project: Phoenix
>  Issue Type: Bug
> Environment: hbase-0.98.12.1-hadoop2, phoenix-4.5.0-HBase-0.98-bin
>Reporter: lihuaqing
>
> I cannot get Phoenix MapReduce to work with Kerberos. My code is as follows:
> final Configuration configuration = HBaseConfiguration.create();
> configuration.set("hbase.security.authentication","kerberos");
> configuration.set("hadoop.security.authentication", "kerberos");
> configuration.set("hbase.master.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set("hbase.regionserver.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set(QueryServices.HBASE_CLIENT_PRINCIPAL,"***");
> configuration.set(QueryServices.HBASE_CLIENT_KEYTAB,"***");
> final Job job = Job.getInstance(configuration, "phoenix-mr-job");
> // We can either specify a selectQuery or ignore it when we would like to 
> retrieve all the columns
> final String selectQuery = "SELECT 
> STOCK_NAME,RECORDING_YEAR,RECORDINGS_QUARTER FROM STOCK ";
> // StockWritable is the DBWritable class that enables us to process the 
> Result of the above query
> PhoenixMapReduceUtil.setInput(job, StockWritable.class, "STOCK",  
> selectQuery);  
> // Set the target Phoenix table and the columns
> PhoenixMapReduceUtil.setOutput(job, "STOCK_STATS", 
> "STOCK_NAME,MAX_RECORDING");
> job.setMapperClass(StockMapper.class);
> job.setReducerClass(StockReducer.class); 
> job.setOutputFormatClass(PhoenixOutputFormat.class);
> job.setMapOutputKeyClass(Text.class);
> job.setMapOutputValueClass(DoubleWritable.class);
> job.setOutputKeyClass(NullWritable.class);
> job.setOutputValueClass(StockWritable.class); 
> TableMapReduceUtil.addDependencyJars(job);
> job.waitForCompletion(true);
> I get the following error stack:
> 2015-08-24 12:12:15,767 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: java.sql.SQLException: 
> ERROR 103 (08004): Unable to establish connection.
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:69)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:512)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.sql.SQLException: ERROR 103 (08004): Unable to establish 
> connection.
>   at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
>   at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:297)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:180)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1901)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
>   at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
>   at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
>   at java.sql.DriverManager.getConnection(DriverManager.java:579)
>   at java.sql.DriverManager.getConnection(DriverManager.java:190)
>   at 
> org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:93)
>   at 
> org.apache.phoenix.mapreduce.

[jira] [Comment Edited] (PHOENIX-2200) Can phoenix support mapreduce with secure hbase(kerberos)?

2015-08-28 Thread lihuaqing (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718536#comment-14718536
 ] 

lihuaqing edited comment on PHOENIX-2200 at 8/28/15 1:14 PM:
-

I am able to kinit successfully from the machine that is running the job 
driver, and I have put hbase-site.xml on the HADOOP_CLASSPATH. What am I 
missing? 


was (Author: scootli):
I am able to kinit successfully from the machine that is running the job 
driver. and I have put hbase-site.xml in the HADOOP_CLASSPATH. 

> Can phoenix support mapreduce with secure hbase(kerberos)? 
> ---
>
> Key: PHOENIX-2200
> URL: https://issues.apache.org/jira/browse/PHOENIX-2200
> Project: Phoenix
>  Issue Type: Bug
> Environment: hbase-0.98.12.1-hadoop2, phoenix-4.5.0-HBase-0.98-bin
>Reporter: lihuaqing
>
> I cannot get Phoenix MapReduce to work with Kerberos. My code is as follows:
> final Configuration configuration = HBaseConfiguration.create();
> configuration.set("hbase.security.authentication","kerberos");
> configuration.set("hadoop.security.authentication", "kerberos");
> configuration.set("hbase.master.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set("hbase.regionserver.kerberos.principal","hbase/_HOST@DATA.SCLOUD");
> configuration.set(QueryServices.HBASE_CLIENT_PRINCIPAL,"***");
> configuration.set(QueryServices.HBASE_CLIENT_KEYTAB,"***");
> final Job job = Job.getInstance(configuration, "phoenix-mr-job");
> // We can either specify a selectQuery or ignore it when we would like to 
> retrieve all the columns
> final String selectQuery = "SELECT 
> STOCK_NAME,RECORDING_YEAR,RECORDINGS_QUARTER FROM STOCK ";
> // StockWritable is the DBWritable class that enables us to process the 
> Result of the above query
> PhoenixMapReduceUtil.setInput(job, StockWritable.class, "STOCK",  
> selectQuery);  
> // Set the target Phoenix table and the columns
> PhoenixMapReduceUtil.setOutput(job, "STOCK_STATS", 
> "STOCK_NAME,MAX_RECORDING");
> job.setMapperClass(StockMapper.class);
> job.setReducerClass(StockReducer.class); 
> job.setOutputFormatClass(PhoenixOutputFormat.class);
> job.setMapOutputKeyClass(Text.class);
> job.setMapOutputValueClass(DoubleWritable.class);
> job.setOutputKeyClass(NullWritable.class);
> job.setOutputValueClass(StockWritable.class); 
> TableMapReduceUtil.addDependencyJars(job);
> job.waitForCompletion(true);
> I get the following error stack:
> 2015-08-24 12:12:15,767 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: java.sql.SQLException: 
> ERROR 103 (08004): Unable to establish connection.
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>   at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:69)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.(MapTask.java:512)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.sql.SQLException: ERROR 103 (08004): Unable to establish 
> connection.
>   at 
> org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
>   at 
> org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:297)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:180)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1901)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
>   at 
> org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1880)
>   at 
> org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
>   at 
> org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
>   at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
>   at java.sql.DriverManager.getConnection(DriverManager.java

[jira] [Updated] (PHOENIX-2206) Math function called with PK column ignores WHERE clause

2015-08-28 Thread James Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2206:
--
Attachment: PHOENIX-2206_v2.patch

Rebased on master

> Math function called with PK column ignores WHERE clause
> 
>
> Key: PHOENIX-2206
> URL: https://issues.apache.org/jira/browse/PHOENIX-2206
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.5.0
>Reporter: ckran
>Assignee: James Taylor
>Priority: Minor
>  Labels: exp, filter
> Attachments: PHOENIX-2206.patch, PHOENIX-2206_v2.patch
>
>
> A WHERE condition that uses a scalar function such as EXP is ignored and all 
> rows are returned. 
> create table test (id integer primary key) ;
> upsert into test values (1) ;
> upsert into test values (2) ;
> upsert into test values (3) ;
> upsert into test values (4) ;
> upsert into test values (5) ;
> select ID, exp(ID) from test where exp(ID) < 10 ;
> Result is:
> 1 2.718281828459045
> 2 7.38905609893065
> 3 20.085536923187668
> 4 54.598150033144236
> 5 148.4131591025766
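
For reference, the expected result can be sanity-checked outside Phoenix 
entirely (plain Java, no Phoenix involved): exp(1) and exp(2) are below 10, so 
a correct evaluation of the WHERE clause should return only IDs 1 and 2.

```java
import java.util.ArrayList;
import java.util.List;

// Independent check of what "WHERE exp(ID) < 10" should return for IDs 1..5:
// exp(1) ~ 2.72 and exp(2) ~ 7.39 pass the filter; exp(3..5) do not.
public class ExpFilterCheck {
    static List<Integer> filterExpBelow(int maxId, double threshold) {
        List<Integer> matching = new ArrayList<>();
        for (int id = 1; id <= maxId; id++) {
            if (Math.exp(id) < threshold) {
                matching.add(id);
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        System.out.println(filterExpBelow(5, 10.0)); // [1, 2]
    }
}
```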





[jira] [Created] (PHOENIX-2217) Error "Nested aggregate functions are not supported" in the following query

2015-08-28 Thread Boris Furchin (JIRA)
Boris Furchin created PHOENIX-2217:
--

 Summary: Error "Nested aggregate functions are not supported" in the 
following query
 Key: PHOENIX-2217
 URL: https://issues.apache.org/jira/browse/PHOENIX-2217
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.5.0
 Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 
10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Boris Furchin


SELECT
  MAX(''),
  MAX(i)
FROM
  myjunk

To reproduce:
create table myjunk(i integer primary key)
upsert into myjunk values (1)





[jira] [Updated] (PHOENIX-2186) Creating backend services for the Phoenix Tracing Web App

2015-08-28 Thread Nishani (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishani  updated PHOENIX-2186:
--
Labels: Tracing  (was: )

> Creating backend services for the Phoenix Tracing Web App
> -
>
> Key: PHOENIX-2186
> URL: https://issues.apache.org/jira/browse/PHOENIX-2186
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: Nishani 
>Assignee: Nishani 
>  Labels: Tracing
>
> This will include the following components.
> Main class 
> Pom file
> Launch script 
> Backend trace service API





[jira] [Commented] (PHOENIX-2206) Math function called with PK column ignores WHERE clause

2015-08-28 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720379#comment-14720379
 ] 

Thomas D'Silva commented on PHOENIX-2206:
-

+1

> Math function called with PK column ignores WHERE clause
> 
>
> Key: PHOENIX-2206
> URL: https://issues.apache.org/jira/browse/PHOENIX-2206
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.5.0
>Reporter: ckran
>Assignee: James Taylor
>Priority: Minor
>  Labels: exp, filter
> Attachments: PHOENIX-2206.patch, PHOENIX-2206_v2.patch
>
>
> A WHERE condition that uses a scalar function such as EXP is ignored and all 
> rows are returned. 
> create table test (id integer primary key) ;
> upsert into test values (1) ;
> upsert into test values (2) ;
> upsert into test values (3) ;
> upsert into test values (4) ;
> upsert into test values (5) ;
> select ID, exp(ID) from test where exp(ID) < 10 ;
> Result is:
> 1 2.718281828459045
> 2 7.38905609893065
> 3 20.085536923187668
> 4 54.598150033144236
> 5 148.4131591025766





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: PHOENIX-2154_HBase_Frontdoor_API_v2.patch

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2198) Support correlate variable

2015-08-28 Thread Maryann Xue (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue updated PHOENIX-2198:
-
Attachment: PHOENIX-2198.patch

1. Added RuntimeContext/RuntimeContextImpl for setting and getting correlate 
variables.
2. Added CorrelateVariableFieldAccessExpression which will be evaluated at 
runtime and serialized to server (if necessary) as LiteralExpression.
3. Added CorrelatedPlan and CorrelatedPlanTest.
4. Added dynamicFilter to BaseQueryPlan. The compiler will be responsible for 
testing whether the filter expression contains any 
CorrelateVariableFieldAccessExpression and whether this "dynamic part" will 
have an impact on ScanRanges. The dynamic filter should be set when both 
conditions hold. (This part is in the calcite branch.)
5. Made minor changes to HashJoinPlan and UngroupedAggregatingResultIterator.
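As a rough illustration of the runtime-context idea in items 1 and 2 (a language-agnostic sketch with hypothetical names, not the actual Phoenix classes), the correlated plan binds a variable per outer row and restarts the inner plan:

```python
# Sketch of a correlated plan: bind the correlate variable for each outer
# row, then re-execute the inner plan with that binding.
class RuntimeContext:
    """Hypothetical stand-in for RuntimeContext/RuntimeContextImpl."""
    def __init__(self):
        self._vars = {}

    def set_variable(self, name, value):
        self._vars[name] = value

    def get_variable(self, name):
        return self._vars[name]

def correlated_plan(outer_rows, inner_plan, ctx, var="$cor0"):
    results = []
    for row in outer_rows:
        ctx.set_variable(var, row)        # set the correlate variable
        # restart the inner query for this iteration
        results.extend(inner_plan(ctx.get_variable(var)))
    return results

# Usage: the inner "query" is parameterized by the current outer row.
rows = correlated_plan([1, 2, 3], lambda v: [(v, v * 10)], RuntimeContext())
print(rows)  # [(1, 10), (2, 20), (3, 30)]
```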

> Support correlate variable
> --
>
> Key: PHOENIX-2198
> URL: https://issues.apache.org/jira/browse/PHOENIX-2198
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Maryann Xue
>Assignee: Maryann Xue
> Attachments: PHOENIX-2198.patch
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> This will enable the outer query to set a correlate variable as a parameter 
> to restart the inner query for each iteration.





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720471#comment-14720471
 ] 

Ravi Kishore Valeti commented on PHOENIX-2154:
--

Uploaded v2 - moved the index state change step to the reducer. 
Each map task now writes just a single dummy key-value, to avoid outputting all 
records to the reducer.

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: PHOENIX-2154-_HBase_Frontdoor_API_v2.patch

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: (was: PHOENIX-2154_HBase_Frontdoor_API_v2.patch)

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: (was: PHOENIX-2154-_HBase_Frontdoor_API_v2.patch)

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: PHOENIX-2154-_HBase_Frontdoor_API_v2.patch

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows getting 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: PHOENIX-2154-_HBase_Frontdoor_API_v2.patch

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Kishore Valeti updated PHOENIX-2154:
-
Attachment: (was: PHOENIX-2154-_HBase_Frontdoor_API_v2.patch)

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720593#comment-14720593
 ] 

James Taylor commented on PHOENIX-2154:
---

If we don't write the dummy key value in the mapper.cleanup() method, what 
happens? Will the reducer run over all the KeyValues we generated during the 
map phase? Does writing that dummy key value prevent this? Seems somewhat 
weird, but I'm +1 if it works (and is necessary).

[~tdsilva] - would you mind reviewing too?

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Ravi Kishore Valeti (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720635#comment-14720635
 ] 

Ravi Kishore Valeti commented on PHOENIX-2154:
--

Ex: we have a single map task with 1000 records.

If we do context.write() in the map() method, the number of input records for 
the reducer = 1000.
If we do context.write() in the cleanup() method (which is called once all 
records have been processed by map()), writing a single dummy key-value, then 
the number of input records to the reducer = the number of map tasks = 1.

I tested the cases below:

- context.write(key, value) in the map() method
Result:  map input records = 1000
 map output records = 1000
 reducer input records = 1000
- context.write(key, value) in the cleanup() method
Result:  map input records = 1000
 map output records = 1
 reducer input records = 1
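The two strategies can be sketched with a toy simulation (hypothetical helper names; the real code extends Hadoop's Mapper and emits via context.write()):

```python
# Toy model of a Hadoop mapper: map() runs once per record, cleanup() runs
# once after all records. Emitting only in cleanup() shrinks reducer input
# to one record per map task.
def run_mapper(records, emit_in_cleanup):
    emitted = []
    for rec in records:                  # map() phase: once per record
        if not emit_in_cleanup:
            emitted.append(("key", rec))
    if emit_in_cleanup:                  # cleanup() phase: once per task
        emitted.append(("dummy", None))  # single dummy key-value
    return emitted

records = range(1000)
print(len(run_mapper(records, emit_in_cleanup=False)))  # 1000 reducer inputs
print(len(run_mapper(records, emit_in_cleanup=True)))   # 1 reducer input
```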

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720645#comment-14720645
 ] 

James Taylor commented on PHOENIX-2154:
---

I see. LGTM. Nice work, [~rvaleti]. Let's get this committed and then perf test 
it.

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows getting 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Updated] (PHOENIX-2182) Pherf - Add ability to compare of run(s) and generate warning if performance degrades beyond set threshold

2015-08-28 Thread Mujtaba Chohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mujtaba Chohan updated PHOENIX-2182:

Attachment: PHOENIX-2182.patch

The attached patch allows a run to be labeled via the -label command line 
argument. Once a run is labeled, it can be used for comparison (via the 
-compare command line argument) to generate a comparative Google bar chart.

> Pherf - Add ability to compare of run(s) and generate warning if performance 
> degrades beyond set threshold
> --
>
> Key: PHOENIX-2182
> URL: https://issues.apache.org/jira/browse/PHOENIX-2182
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Mujtaba Chohan
> Attachments: PHOENIX-2182.patch
>
>
> Add the ability to compare runs and generate a warning if performance degrades 
> beyond a set threshold. This also requires that runs can be labeled as known 
> baselines.





[jira] [Updated] (PHOENIX-1812) Only sync table metadata when necessary

2015-08-28 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-1812:

Attachment: PHOENIX-1812-v6.patch

[~jamestaylor] I have attached a patch addressing the review feedback. Please 
review when you get a chance.

> Only sync table metadata when necessary
> ---
>
> Key: PHOENIX-1812
> URL: https://issues.apache.org/jira/browse/PHOENIX-1812
> Project: Phoenix
>  Issue Type: Sub-task
>Reporter: James Taylor
>Assignee: Thomas D'Silva
> Attachments: PHOENIX-1812-v2.patch, PHOENIX-1812-v3.patch, 
> PHOENIX-1812-v4-WIP.patch, PHOENIX-1812-v5.patch, PHOENIX-1812-v6.patch, 
> PHOENIX-1812.patch, PHOENIX-1812.patch, PHOENIX-1812.patch
>
>
> With transactions, we hold the timestamp at the point when the transaction 
> was opened. We can skip the MetaDataEndpoint getTable RPC in 
> MetaDataClient.updateCache() (which checks that the client has the latest 
> table) if we've already checked at the current transaction ID timestamp. We 
> can keep track of which tables we've already updated in PhoenixConnection.





[jira] [Commented] (PHOENIX-1700) Remove ColumnProjectionFilter

2015-08-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720782#comment-14720782
 ] 

Lars Hofhansl commented on PHOENIX-1700:


I think we can do this now. [~giacomotaylor], [~samarthjain], [~mujtabachohan]. 
Might want to do a round of perf testing.


> Remove ColumnProjectionFilter
> -
>
> Key: PHOENIX-1700
> URL: https://issues.apache.org/jira/browse/PHOENIX-1700
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Now that HBASE-13109 is committed, we no longer need this optimization. HBase 
> will itself do the right thing and cover a much wider range of cases.
> HBASE-13109 will be in 0.98.12, so we don't want to remove this immediately, 
> but maybe with the first version of Phoenix supporting HBase 1.0.x.
> Just filing as a reminder for now.





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720858#comment-14720858
 ] 

Thomas D'Silva commented on PHOENIX-2154:
-

+1. I will get this committed.

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720910#comment-14720910
 ] 

Hudson commented on PHOENIX-2154:
-

SUCCESS: Integrated in Phoenix-master #883 (See 
[https://builds.apache.org/job/Phoenix-master/883/])
PHOENIX-2154 Failure of one mapper should not affect other mappers in MR index 
build (Ravi Kishore Valeti) (tdsilva: rev 
16fcdf9e1c116758027b79a24f9ec701cb63496f)
* 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexToolUtil.java
* 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/PhoenixIndexImportDirectMapper.java
* phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java
* 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/DirectHTableWriter.java
* phoenix-core/src/it/java/org/apache/phoenix/mapreduce/IndexToolIT.java
* 
phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/PhoenixIndexToolReducer.java


> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Resolved] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

2015-08-28 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva resolved PHOENIX-2154.
-
Resolution: Fixed

> Failure of one mapper should not affect other mappers in MR index build
> ---
>
> Key: PHOENIX-2154
> URL: https://issues.apache.org/jira/browse/PHOENIX-2154
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>Assignee: Ravi Kishore Valeti
> Attachments: IndexTool.java, PHOENIX-2154-WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_WIP.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v1.patch, 
> PHOENIX-2154-_HBase_Frontdoor_API_v2.patch
>
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done 
> in the event of the failure of one of the other mappers. The initial 
> population of an index is based on a snapshot in time, so new rows written 
> *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so 
> there's really no need to dedup. However, the index rows will have a 
> different row key than the data table, so I'm not sure how the HFiles are 
> split. Will they potentially overlap and is this an issue?





[jira] [Commented] (PHOENIX-2198) Support correlate variable

2015-08-28 Thread Maryann Xue (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720934#comment-14720934
 ] 

Maryann Xue commented on PHOENIX-2198:
--

I got a random failure in StatsCollectorWithSplitsAndMultiCFIT while verifying 
this patch (only once out of a few runs). Does it look like something that has 
been seen before, [~jamestaylor]?

StatsCollectorWithSplitsAndMultiCFIT.testSplitUpdatesStats:225 expected:<8> but 
was:<7>

> Support correlate variable
> --
>
> Key: PHOENIX-2198
> URL: https://issues.apache.org/jira/browse/PHOENIX-2198
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Maryann Xue
>Assignee: Maryann Xue
> Attachments: PHOENIX-2198.patch
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> This will enable the outer query to set a correlate variable as a parameter 
> to restart the inner query for each iteration.


