[GitHub] phoenix pull request:

2016-03-21 Thread djh4230
Github user djh4230 commented on the pull request:


https://github.com/apache/phoenix/commit/48e589773cbf46a10a2c1bd5cf483f2390ae1160#commitcomment-16779652
  
Hi James,
We have been using Phoenix for a while. We recently found that aggregation 
operations, like filter, group by, order by, etc., consume too much cache on 
the server side. We suspected a memory leak, but today I found that you have 
fixed a memory leak bug. Could you please describe how the memory leak 
manifested in your case?




[jira] [Updated] (PHOENIX-2535) Create shaded clients (thin + thick)

2016-03-21 Thread Sergey Soldatov (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated PHOENIX-2535:
-
Attachment: PHOENIX-2535-3.patch

Added missing packages to the spark client. 

A quick question: do we still need a separate spark client if the thick client 
will have shaded versions of the libraries that were causing problems with Spark?

> Create shaded clients (thin + thick) 
> -
>
> Key: PHOENIX-2535
> URL: https://issues.apache.org/jira/browse/PHOENIX-2535
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Sergey Soldatov
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, 
> PHOENIX-2535-3.patch
>
>
> Having shaded client artifacts helps greatly in minimizing dependency 
> conflicts at runtime. We are seeing the Phoenix JDBC client being used more 
> and more in Storm topologies and other settings where guava versions become a 
> problem. 
> I think we can do a parallel artifact for the thick client with shaded 
> dependencies, also using shaded HBase. For the thin client, maybe shading 
> should be the default since it is new?





[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)

2016-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203950#comment-15203950
 ] 

Hadoop QA commented on PHOENIX-2535:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12794482/PHOENIX-2535-3.patch
  against master branch at commit cd8e86ca7170876a30771fcc16c027f8dc8dd386.
  ATTACHMENT ID: 12794482

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev patch that doesn't require tests.

{color:red}-1 javac{color}.  The applied patch generated 234 javac compiler 
warnings (more than the master's current 81 warnings).

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
${project.basedir}/../../config/csv-bulk-load-config.properties
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
${project.basedir}/../../config/csv-bulk-load-config.properties
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">
+
implementation="org.apache.maven.plugins.shade.resource.IncludeResourceTransformer">

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/284//testReport/
Javadoc warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/284//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/284//console

This message is automatically generated.

> Create shaded clients (thin + thick) 
> -
>
> Key: PHOENIX-2535
> URL: https://issues.apache.org/jira/browse/PHOENIX-2535
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Sergey Soldatov
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, 
> PHOENIX-2535-3.patch
>
>
> Having shaded client artifacts helps greatly in minimizing dependency 
> conflicts at runtime. We are seeing the Phoenix JDBC client being used more 
> and more in Storm topologies and other settings where guava versions become a 
> problem. 
> I think we can do a parallel artifact for the thick client with shaded 
> dependencies, also using shaded HBase. For the thin client, maybe shading 
> should be the default since it is new?





[jira] [Updated] (PHOENIX-1311) HBase namespaces surfaced in phoenix

2016-03-21 Thread Ankit Singhal (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated PHOENIX-1311:
---
Attachment: PHOENIX-1311_v2.patch

Updated with:
* review comments
* a DROP SCHEMA construct (currently, dropping a schema is not allowed if any 
table is present; let me know if we need to support deleting all tables when a 
schema is dropped, by giving control to the user via a client-side property)
* LocalIndex and ViewIndex backward compatibility; prefixes are moved from the 
schema to the table name
* some more test cases

> HBase namespaces surfaced in phoenix
> 
>
> Key: PHOENIX-1311
> URL: https://issues.apache.org/jira/browse/PHOENIX-1311
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: nicolas maillard
>Assignee: Ankit Singhal
>Priority: Minor
> Fix For: 4.8.0
>
> Attachments: PHOENIX-1311.docx, PHOENIX-1311_v1.patch, 
> PHOENIX-1311_v2.patch, PHOENIX-1311_wip.patch, PHOENIX-1311_wip_2.patch
>
>
> HBase (HBASE-8015) has the concept of namespaces, in the form 
> myNamespace:MyTable. It would be great if Phoenix leveraged this feature to 
> provide a database-like layer on top of tables.
> Maybe to stay close to HBase it could also be CREATE DB:Table..., 
> or DB.Table, which is more standard notation?





[jira] [Commented] (PHOENIX-1121) Improve tracing in Phoenix

2016-03-21 Thread Menaka Madushanka (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204430#comment-15204430
 ] 

Menaka Madushanka commented on PHOENIX-1121:


Thank you very much James. I'll go through them.

> Improve tracing in Phoenix
> --
>
> Key: PHOENIX-1121
> URL: https://issues.apache.org/jira/browse/PHOENIX-1121
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: James Taylor
>  Labels: gsoc2016, tracing
>






[jira] [Created] (PHOENIX-2786) Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat

2016-03-21 Thread churro morales (JIRA)
churro morales created PHOENIX-2786:
---

 Summary: Can MultiTableOutputFormat be used instead of 
MultiHfileOutputFormat
 Key: PHOENIX-2786
 URL: https://issues.apache.org/jira/browse/PHOENIX-2786
 Project: Phoenix
  Issue Type: Task
Reporter: churro morales


MultiHfileOutputFormat depends on a lot of HBase classes that it shouldn't 
depend on.  It seems like MultiHfileOutputFormat and MultiTableOutputFormat 
have the same goal. 





[jira] [Commented] (PHOENIX-2786) Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat

2016-03-21 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204494#comment-15204494
 ] 

James Taylor commented on PHOENIX-2786:
---

[~maghamravikiran] & [~gabriel.reid] - we're trying to get rid of dependencies 
on any internal HBase APIs so that we don't have to maintain a separate Phoenix 
branch for each HBase branch. Any idea whether it's feasible to use 
MultiTableOutputFormat versus MultiHfileOutputFormat?

[~churromorales] - is MultiTableOutputFormat available in 0.98 too?

> Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat
> 
>
> Key: PHOENIX-2786
> URL: https://issues.apache.org/jira/browse/PHOENIX-2786
> Project: Phoenix
>  Issue Type: Task
>Reporter: churro morales
>
> MultiHfileOutputFormat depends on a lot of HBase classes that it shouldn't 
> depend on.  It seems like MultiHfileOutputFormat and MultiTableOutputFormat 
> have the same goal. 





[jira] [Commented] (PHOENIX-2786) Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat

2016-03-21 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204501#comment-15204501
 ] 

churro morales commented on PHOENIX-2786:
-

[~jamestaylor] Yes, MultiTableOutputFormat is available in 0.98 and earlier 
versions as well. 

> Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat
> 
>
> Key: PHOENIX-2786
> URL: https://issues.apache.org/jira/browse/PHOENIX-2786
> Project: Phoenix
>  Issue Type: Task
>Reporter: churro morales
>
> MultiHfileOutputFormat depends on a lot of HBase classes that it shouldn't 
> depend on.  It seems like MultiHfileOutputFormat and MultiTableOutputFormat 
> have the same goal. 





[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)

2016-03-21 Thread Josh Mahonin (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204515#comment-15204515
 ] 

Josh Mahonin commented on PHOENIX-2535:
---

[~sergey.soldatov] Just tested this new patch with both Spark 1.6.0 and Spark 
1.5.2. I ran a very small test that loaded a dataframe from a table, saved it 
back to another table, and verified that it worked properly. Good work! 

A few more eyes on this to verify other use cases don't break at runtime (e.g. 
sqlline, standard JDBC, Storm, Flume, Pig, etc.) would be a good idea, but this 
looks good to me.

Re: client-spark, it is no longer required with a properly shaded client JAR. 
It was a bit of a hack originally, so this patch helps make the 
documentation/deployment bits cleaner too.
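
A minimal sketch of the kind of round-trip test described above, assuming the 
phoenix-spark DataFrame data source with Spark 1.x Java APIs; the table names 
and ZooKeeper URL are placeholders:

{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;

public class PhoenixSparkRoundTrip {
  public static void main(String[] args) {
    SQLContext sqlContext =
        new SQLContext(new JavaSparkContext(new SparkConf().setAppName("phoenix-smoke")));

    // Load a DataFrame from one Phoenix table...
    DataFrame df = sqlContext.read()
        .format("org.apache.phoenix.spark")
        .option("table", "INPUT_TABLE")
        .option("zkUrl", "localhost:2181")
        .load();

    // ...and save it to another. phoenix-spark expects SaveMode.Overwrite;
    // rows are upserted rather than the table being truncated.
    df.write()
        .format("org.apache.phoenix.spark")
        .mode(SaveMode.Overwrite)
        .option("table", "OUTPUT_TABLE")
        .option("zkUrl", "localhost:2181")
        .save();
  }
}
{code}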

> Create shaded clients (thin + thick) 
> -
>
> Key: PHOENIX-2535
> URL: https://issues.apache.org/jira/browse/PHOENIX-2535
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Sergey Soldatov
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, 
> PHOENIX-2535-3.patch
>
>
> Having shaded client artifacts helps greatly in minimizing dependency 
> conflicts at runtime. We are seeing the Phoenix JDBC client being used more 
> and more in Storm topologies and other settings where guava versions become a 
> problem. 
> I think we can do a parallel artifact for the thick client with shaded 
> dependencies, also using shaded HBase. For the thin client, maybe shading 
> should be the default since it is new?





[jira] [Commented] (PHOENIX-2786) Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat

2016-03-21 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204747#comment-15204747
 ] 

Sergey Soldatov commented on PHOENIX-2786:
--

As for the bulk load stuff, I would say that MultiTableOutputFormat can hardly 
be used. It's designed to put data into HBase tables using mutations, and using 
it for bulk load is meaningless. 

> Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat
> 
>
> Key: PHOENIX-2786
> URL: https://issues.apache.org/jira/browse/PHOENIX-2786
> Project: Phoenix
>  Issue Type: Task
>Reporter: churro morales
>
> MultiHfileOutputFormat depends on a lot of HBase classes that it shouldn't 
> depend on.  It seems like MultiHfileOutputFormat and MultiTableOutputFormat 
> have the same goal. 





[jira] [Commented] (PHOENIX-2786) Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat

2016-03-21 Thread maghamravikiran (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204754#comment-15204754
 ] 

maghamravikiran commented on PHOENIX-2786:
--

[~churromorales] From what I see, MultiTableOutputFormat uses Put / Delete 
mutations rather than writing to HFiles the way MultiHfileOutputFormat does. We 
have definitely seen cases, for example with a newly created table, where 
direct writes to HBase perform much better than a bulk load, but in general 
writing to HFiles performs better. 
I agree with your valid point that the code in MultiHfileOutputFormat borrows a 
lot from HFileOutputFormat, with only a few minor changes. 
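
To make the distinction concrete, a minimal sketch of the mutation-based path 
being discussed, assuming the HBase 0.98-era client API; the input layout and 
table/column names are invented for illustration:

{code:java}
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Each output record is (table name, mutation); MultiTableOutputFormat routes
// the Put to the named table through the regular HBase client write path,
// rather than producing HFiles the way MultiHfileOutputFormat does.
public class MultiTableWriteMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    // Assumed input layout for illustration: rowkey,tableName,value
    String[] parts = line.toString().split(",");
    Put put = new Put(Bytes.toBytes(parts[0]));
    // 0.98-era API; later client versions use addColumn()
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(parts[2]));
    context.write(new ImmutableBytesWritable(Bytes.toBytes(parts[1])), put);
  }
}
// Driver side:
// job.setOutputFormatClass(
//     org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat.class);
{code}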

> Can MultiTableOutputFormat be used instead of MultiHfileOutputFormat
> 
>
> Key: PHOENIX-2786
> URL: https://issues.apache.org/jira/browse/PHOENIX-2786
> Project: Phoenix
>  Issue Type: Task
>Reporter: churro morales
>
> MultiHfileOutputFormat depends on a lot of HBase classes that it shouldn't 
> depend on.  It seems like MultiHfileOutputFormat and MultiTableOutputFormat 
> have the same goal. 





[jira] [Created] (PHOENIX-2787) support IF EXISTS for ALTER TABLE SET options

2016-03-21 Thread Vincent Poon (JIRA)
Vincent Poon created PHOENIX-2787:
-

 Summary: support IF EXISTS for ALTER TABLE SET options
 Key: PHOENIX-2787
 URL: https://issues.apache.org/jira/browse/PHOENIX-2787
 Project: Phoenix
  Issue Type: Improvement
Affects Versions: 4.8.0
Reporter: Vincent Poon
Priority: Trivial


A nice-to-have improvement to the grammar:

ALTER TABLE my_table IF EXISTS SET options

Currently, 'IF EXISTS' only works for dropping/adding a column.





[jira] [Created] (PHOENIX-2788) Make transactions pluggable in Phoenix

2016-03-21 Thread James Taylor (JIRA)
James Taylor created PHOENIX-2788:
-

 Summary: Make transactions pluggable in Phoenix
 Key: PHOENIX-2788
 URL: https://issues.apache.org/jira/browse/PHOENIX-2788
 Project: Phoenix
  Issue Type: Bug
Reporter: James Taylor


Given that there's now another transaction library for transactions over HBase, 
Omid, which will likely be entering the incubator soon, we should investigate 
what it'll take to make our transaction support pluggable. Omid may not be that 
difficult to plug in, given that its basic approach (snapshot isolation) is 
similar to Tephra's (but of course the devil's in the details).





[jira] [Commented] (PHOENIX-2788) Make transactions pluggable in Phoenix

2016-03-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205004#comment-15205004
 ] 

stack commented on PHOENIX-2788:


What would it take, [~giacomotaylor]? I thought the notion of a pluggable 
transaction API was killed by the comment at the end of "HBASE-11447 Proposal 
for a generic transaction API for HBase" by [~ghelmling], where he notes that 
transactions are begin/end/rollback at the highest level, but when you dig in, 
there are a lot of implementation specifics that are hard to abstract.

> Make transactions pluggable in Phoenix
> --
>
> Key: PHOENIX-2788
> URL: https://issues.apache.org/jira/browse/PHOENIX-2788
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Given that there's now another transaction library for transactions over 
> HBase, Omid, which will likely be entering the incubator soon, we should 
> investigate what it'll take to make our transaction support pluggable. Omid 
> may not be that difficult to plug in, given that its basic approach (snapshot 
> isolation) is similar to Tephra's (but of course the devil's in the details).





[jira] [Commented] (PHOENIX-2535) Create shaded clients (thin + thick)

2016-03-21 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205172#comment-15205172
 ] 

Sergey Soldatov commented on PHOENIX-2535:
--

[~jmahonin] Thank you for the feedback! sqlline and JDBC via SQuirreL are part 
of my regular testing. It would be nice if someone tried it with Storm, Flume, 
and other apps. 

> Create shaded clients (thin + thick) 
> -
>
> Key: PHOENIX-2535
> URL: https://issues.apache.org/jira/browse/PHOENIX-2535
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Sergey Soldatov
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2535-1.patch, PHOENIX-2535-2.patch, 
> PHOENIX-2535-3.patch
>
>
> Having shaded client artifacts helps greatly in minimizing dependency 
> conflicts at runtime. We are seeing the Phoenix JDBC client being used more 
> and more in Storm topologies and other settings where guava versions become a 
> problem. 
> I think we can do a parallel artifact for the thick client with shaded 
> dependencies, also using shaded HBase. For the thin client, maybe shading 
> should be the default since it is new?





[jira] [Commented] (PHOENIX-2780) Escape double quotation in dynamic field names

2016-03-21 Thread Sergey Soldatov (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205195#comment-15205195
 ] 

Sergey Soldatov commented on PHOENIX-2780:
--

It's expected behavior, since most RDBMSs don't support special characters in 
column names.  

> Escape double quotation in dynamic field names
> --
>
> Key: PHOENIX-2780
> URL: https://issues.apache.org/jira/browse/PHOENIX-2780
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Powpow Shen
>
> UPSERTing a row into a table with \' (an escaped single quotation mark) in a 
> value is allowed, but a row with \" (an escaped double quotation mark) in a 
> field name is not allowed. For example:
> {quote}
> upsert into "test"("id", "static", "dynamic" varchar) values (0, 's', 'd\'');
> {quote}
> is OK
> {quote}
> upsert into "test"("id", "static", "dynamic\"" varchar) values (0, 's', 'd');
> {quote}
> is NOT allowed.





[jira] [Commented] (PHOENIX-2780) Escape double quotation in dynamic field names

2016-03-21 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205229#comment-15205229
 ] 

James Taylor commented on PHOENIX-2780:
---

I think it's a bug, as column names are sometimes generated. It would be good 
to know how other databases handle this.

> Escape double quotation in dynamic field names
> --
>
> Key: PHOENIX-2780
> URL: https://issues.apache.org/jira/browse/PHOENIX-2780
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Powpow Shen
>
> UPSERTing a row into a table with \' (an escaped single quotation mark) in a 
> value is allowed, but a row with \" (an escaped double quotation mark) in a 
> field name is not allowed. For example:
> {quote}
> upsert into "test"("id", "static", "dynamic" varchar) values (0, 's', 'd\'');
> {quote}
> is OK
> {quote}
> upsert into "test"("id", "static", "dynamic\"" varchar) values (0, 's', 'd');
> {quote}
> is NOT allowed.





[jira] [Commented] (PHOENIX-2788) Make transactions pluggable in Phoenix

2016-03-21 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205263#comment-15205263
 ] 

Gary Helmling commented on PHOENIX-2788:


I really did not mean to kill discussion on HBASE-11447.  I was merely trying 
to dig into some important details of how the API would tie in to the rest of 
the HBase client, and whether or not anyone making use of it would be tied 
directly to implementation-specific code for common cases, both of which seemed 
to be missing in the current proposal.

Reviving HBASE-11447 and then using those APIs might be a way of making this 
pluggable from the Phoenix standpoint.  But from what I recall, the way that 
Phoenix coprocessors hook in to changes for secondary indexes gets pretty deep 
into what the underlying transaction implementation is doing.  So there may be 
more plugging needed on the Phoenix side even if a common HBase API existed.

> Make transactions pluggable in Phoenix
> --
>
> Key: PHOENIX-2788
> URL: https://issues.apache.org/jira/browse/PHOENIX-2788
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Given that there's now another transaction library for transactions over 
> HBase, Omid, which will likely be entering the incubator soon, we should 
> investigate what it'll take to make our transaction support pluggable. Omid 
> may not be that difficult to plug in, given that its basic approach (snapshot 
> isolation) is similar to Tephra's (but of course the devil's in the details).





[jira] [Commented] (PHOENIX-2788) Make transactions pluggable in Phoenix

2016-03-21 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205519#comment-15205519
 ] 

James Taylor commented on PHOENIX-2788:
---

I agree with [~ghelmling] that there's quite a bit more needed on the Phoenix 
side to make transactions pluggable, beyond the HBase API changes. The more 
similar the approaches the transaction libraries have taken, the more likely it 
is that they can be made pluggable. I think it's possible between Tephra and 
Omid (both implementations of snapshot isolation), but I don't think it would 
extend well to the Percolator-like approach taken by the XiaoMi folks for 
Themis.

To get a broad idea of how a transaction layer could be made pluggable, you can 
look at the components of the Tephra architecture. The way Tephra plugged into 
HBase helped tremendously in being able to integrate it with Phoenix.
- TransactionAwareHTable. This is a wrapper on HTable that delegates to the 
regular HTable, but attaches metadata to operations (used on the server-side) 
to make them transactional.
- Transaction Manager. This doles out transaction IDs and handles conflict 
detection. It also provides a means of getting the in-flight and invalid 
transaction IDs.
- Transaction Coprocessor. This handles attaching the visibility filter to 
filter invalid and inflight transactions, setting the cell timestamp to the 
transactionID, and converting deletes to the appropriate transaction-specific 
delete markers (see below for more on this).
- Transaction Janitor. Handles cleaning up invalid data on flush or compaction.

Some key interfaces and classes in Tephra are Transaction, TransactionContext, 
TransactionClient, and TransactionAware (see 
https://github.com/caskdata/tephra#client-apis for some good docs).
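
As a rough illustration of those client APIs, sketched from the Tephra docs 
linked above rather than from Phoenix's integration code; package names assume 
the Tephra 0.6.x / HBase 0.98 compat module layout and may differ by version:

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

import co.cask.tephra.TransactionContext;
import co.cask.tephra.TransactionFailureException;
import co.cask.tephra.TransactionSystemClient;
import co.cask.tephra.hbase98.TransactionAwareHTable;

public class TephraClientSketch {
  // txClient talks to the Transaction Manager; conf is a normal HBase config.
  public void transactionalPut(TransactionSystemClient txClient, Configuration conf)
      throws IOException, TransactionFailureException {
    // The TransactionAware wrapper delegates to HTable but attaches transaction
    // metadata to operations, which the server-side coprocessor interprets.
    TransactionAwareHTable txTable =
        new TransactionAwareHTable(new HTable(conf, "MY_TABLE"));
    TransactionContext txContext = new TransactionContext(txClient, txTable);

    txContext.start();                 // begin: obtain a transaction ID
    try {
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      txTable.put(put);
      txContext.finish();              // commit: conflict detection happens here
    } catch (TransactionFailureException e) {
      txContext.abort();               // roll back uncommitted writes
      throw e;
    }
  }
}
{code}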

From the Phoenix requirements standpoint, here are the detailed ways (and 
reasons) we leverage the various interfaces and components of Tephra to 
provide a reasonable solution. There could be alternate ways of Tephra 
implementing these in HBase, I'm sure, but this is from the standpoint of how 
they're implemented today in Tephra, Phoenix, and HBase, so hopefully this 
gives an idea of the functionality that would need to become pluggable:
* *Enabling a client to see their own uncommitted writes*. This typically means 
that mutations to data (including deletes) are written to HBase, but filtered 
from the scans of other clients until the commit is performed. This implies that 
you need a way to undo these changes if the commit fails or is manually rolled 
back. This is where we could use HBASE-11292. The alternative is to have your 
own family and cell delete markers (which end up just being Puts) so that they 
can be undone (which is what is done in Tephra). Without HBASE-11292, different 
transaction libraries would need to agree on what constitutes a delete marker 
to have a good interop story, and there would also be a fair amount of 
duplicated effort around each library implementing its own delete markers.
* *Query all versions of uncommitted data*. This was required for secondary 
index support, a driving reason for needing transactions, to enable table 
updates and the corresponding secondary index updates to be transactionally 
consistent. In order to be able to undo the index updates when a rollback 
occurs, we needed to be able to see all versions of mutations that were made in 
that transaction.
* *Getting inflight transaction IDs*. This was needed to handle adding a 
secondary index to a table that's taking writes, as it provided a means of 
ensuring that no writes to the table are missed when creating the secondary 
index. Tephra enables this by providing a few utility methods that allow 
read/write fences to be placed.
* *Transaction checkpointing*. Common in SQL implementations, there's a command 
that reads from a table and directly writes to the same table (UPSERT SELECT). 
In order for a client to still see their own uncommitted data, but not see the 
writes of that statement while in progress (or you can get into an infinite 
loop), you need a way of having multiple transaction IDs associated with a 
single transaction. In this way, you can see uncommitted data, but not see 
writes occurring for a given statement.
* *Cell timestamp that represents the transaction ID*. Having transaction IDs 
represented in the Cell timestamp enables a consistent means of filtering based 
on transaction ID (a requirement for snapshot isolation). Because HBase only 
stores millisecond granularity in the Cell timestamp, Tephra has to multiply 
the timestamp by a million to get enough granularity for unique transaction IDs 
(and to support more than one transaction per millisecond). This is where 
HBASE-8927 would help. The alternative is that transaction libraries agree on 
multiplying timestamps by a million (see the sketch after this list).
* *Cell timestamp that corresponds to wall clock time*. Not every Ph
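
A small sketch of the timestamp convention from the bullet above; the factor of 
one million matches the Tephra behavior described there, while the class and 
helper names are made up for illustration:

{code:java}
// HBase cell timestamps are a single long holding wall-clock milliseconds, so
// Tephra-style transaction IDs gain sub-millisecond granularity by scaling
// milliseconds by one million (up to a million transactions per millisecond).
public final class TxTimestamps {
  static final long TX_PER_MS = 1_000_000L;

  // First transaction ID available within a given wall-clock millisecond.
  static long txIdForMillis(long wallClockMillis) {
    return wallClockMillis * TX_PER_MS;
  }

  // Recover wall-clock time from a cell timestamp holding a transaction ID.
  static long millisForTxId(long txId) {
    return txId / TX_PER_MS;
  }
}
{code}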

[jira] [Commented] (PHOENIX-2788) Make transactions pluggable in Phoenix

2016-03-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205801#comment-15205801
 ] 

stack commented on PHOENIX-2788:


I didn't think you were. You were just trying to help by doing an actual 
abstraction and finding that it would take some work -- and then everyone ran 
away (smile).




> Make transactions pluggable in Phoenix
> --
>
> Key: PHOENIX-2788
> URL: https://issues.apache.org/jira/browse/PHOENIX-2788
> Project: Phoenix
>  Issue Type: Bug
>Reporter: James Taylor
>
> Given that there's now another transaction library for transactions over 
> HBase, Omid, which will likely be entering the incubator soon, we should 
> investigate what it'll take to make our transaction support pluggable. Omid 
> may not be that difficult to plug in, given that its basic approach (snapshot 
> isolation) is similar to Tephra's (but of course the devil's in the details).





[jira] [Commented] (PHOENIX-2780) Escape double quotation in dynamic field names

2016-03-21 Thread Powpow Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205808#comment-15205808
 ] 

Powpow Shen commented on PHOENIX-2780:
--

For PostgreSQL, creating static fields with \" is not allowed, but for a 
dynamic column (a field in a json column) it is allowed.

> Escape double quotation in dynamic field names
> --
>
> Key: PHOENIX-2780
> URL: https://issues.apache.org/jira/browse/PHOENIX-2780
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Powpow Shen
>
> UPSERTing a row into a table with \' (an escaped single quotation mark) in a 
> value is allowed, but a row with \" (an escaped double quotation mark) in a 
> field name is not allowed. For example:
> {quote}
> upsert into "test"("id", "static", "dynamic" varchar) values (0, 's', 'd\'');
> {quote}
> is OK
> {quote}
> upsert into "test"("id", "static", "dynamic\"" varchar) values (0, 's', 'd');
> {quote}
> is NOT allowed.





[GitHub] phoenix pull request:

2016-03-21 Thread djh4230
Github user djh4230 commented on the pull request:


https://github.com/apache/phoenix/commit/31a414c84e64a1de366703cb1faa25c9e506d1a3#commitcomment-16795183
  
Hi, James
I have been using Phoenix for a while. Recently I found that it consumes too 
much memory cache on the server side when I run aggregation operations like 
group by, etc. I suspect that there is a memory leak on the server side, but I 
am not very sure. And I see that you have fixed a bug about the server-side 
memory cache. Could you please describe how it appeared in your case? 




[GitHub] phoenix pull request:

2016-03-21 Thread JamesRTaylor
Github user JamesRTaylor commented on the pull request:


https://github.com/apache/phoenix/commit/48e589773cbf46a10a2c1bd5cf483f2390ae1160#commitcomment-16795354
  
There were no symptoms, only a warning in the logs. In this case, Phoenix is 
purely tracking memory usage, but under some circumstances the client wasn't 
issuing a close, and thus wasn't freeing memory on the server (until a GC 
occurred). I don't think it's related to what you're seeing. Are you seeing 
this in our 4.7.0 release, and if so, can you reliably reproduce it? If you 
wouldn't mind filing a JIRA with the steps necessary to reproduce the issue, 
that would be much appreciated. We have a lot of regression tests in place, but 
it's always possible something slipped through the cracks.
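
A minimal sketch of the client-side close hygiene this relates to (generic 
JDBC, not the actual fix; the JDBC URL, table, and columns are placeholders):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CloseHygieneExample {
  public static void main(String[] args) throws SQLException {
    // try-with-resources guarantees close() runs, letting the server release
    // the memory it tracks for the query instead of waiting for a client GC.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT HOST, COUNT(*) FROM METRICS GROUP BY HOST")) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
      }
    }
  }
}
{code}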

