Re: Multiple upserts via JDBC

2016-02-19 Thread Sergey Soldatov
Hi Zack, Have you tried to use sqlline to manually do those upserts to check the performance? Information about the table structures would be useful as well. Thanks, Sergey On Tue, Feb 16, 2016 at 8:10 AM, Riesland, Zack wrote: > I have a handful of VERY small phoenix tables (< 100 entries). >

Re: Multiple upserts via JDBC

2016-02-19 Thread Sergey Soldatov
ser_access PRIMARY KEY (user_id, screen_id) > ); > > -Original Message----- > From: sergey.solda...@gmail.com [mailto:sergey.solda...@gmail.com] On Behalf > Of Sergey Soldatov > Sent: Friday, February 19, 2016 3:01 AM > To: user@phoenix.apache.org > Subject: Re: Mult

Re: Phoenix rowkey

2016-02-22 Thread Sergey Soldatov
You do it exactly the way you described. Separate the varchars by a zero byte and use a fixed 8 bytes for the long. So, for example, if you have a (varchar, u_long, varchar) primary key, the rowkey for values like 'X',1,'Y' will be: ('X', 0x00) (0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01), ('Y') Thanks, Se
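For illustration, a minimal sketch (not Phoenix's actual serialization code) of composing such a rowkey for ('X', 1, 'Y'), assuming the middle column is an UNSIGNED_LONG:

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RowKeySketch {
    // variable-length VARCHARs are terminated by a zero byte, the unsigned long takes
    // a fixed 8 bytes (big-endian), and the trailing VARCHAR needs no terminator
    static byte[] buildKey(String first, long middle, String last) throws Exception {
        ByteArrayOutputStream key = new ByteArrayOutputStream();
        key.write(first.getBytes(StandardCharsets.UTF_8));
        key.write(0x00);
        key.write(ByteBuffer.allocate(8).putLong(middle).array());
        key.write(last.getBytes(StandardCharsets.UTF_8));
        return key.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        for (byte b : buildKey("X", 1L, "Y")) System.out.printf("%02x ", b);  // 58 00 00 00 00 00 00 00 00 01 59
        System.out.println();
    }
}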

Re: Looks Like a SELECT Bug, But LIMIT Makes It Work

2016-02-23 Thread Sergey Soldatov
Hi Steve, It looks like a bug. So, please file a JIRA. Thanks, Sergey On Tue, Feb 23, 2016 at 12:52 PM, Steve Terrell wrote: > I came across a 4.6.0 query that I could not make work unless I add a > "limit" to the end, where it should be totally unnecessary. > > select * from BUGGY where F1=1 an

Re: HBase Phoenix Integration

2016-02-29 Thread Sergey Soldatov
Hi Amit, Switching to 4.3 means you need HBase 0.98. What kind of problem did you experience after building 4.6 from sources with the changes suggested on StackOverflow? Thanks, Sergey On Sun, Feb 28, 2016 at 10:49 PM, Amit Shah wrote: > An update - > > I was able to execute "./sqlline.py " command bu

Re: Problem with CDH 5.5 with enabling mutable index on table

2016-03-11 Thread Sergey Soldatov
Hi Mohammad, It seems that you are using parcels in your distribution. IIRC in this case the services don't use the configuration files located at /etc/... They are for clients only. Try to add those parameters using the Cloudera Manager HBase safety valve. Thanks, Sergey
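For reference, the region-server setting that the Phoenix secondary indexing guide lists for mutable indexes (and which would go into the HBase safety valve for hbase-site.xml) is:

<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>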

Re: Phoenix table is unaccessable...

2016-03-11 Thread Sergey Soldatov
Hi Saurabh, It seems that your SYSTEM.CATALOG got corrupted somehow. Usually you need to disable and drop 'SYSTEM.CATALOG' in the hbase shell. After that, restart sqlline (it will automatically recreate the system catalog) and recreate all user tables. The table data usually is not affected, but just in ca
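For example, the hbase shell steps described above look like this (back up the table first, just in case):
disable 'SYSTEM.CATALOG'
drop 'SYSTEM.CATALOG'
After restarting sqlline, re-run the original CREATE TABLE statements to repopulate the catalog.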

Re: Phoenix table is inaccessible...

2016-03-11 Thread Sergey Soldatov
ne > > > ----- Original Message - > From: Sergey Soldatov > To: SAURABH AGARWAL, user@phoenix.apache.org > CC: ANIRUDHA JADHAV > At: 11-Mar-2016 19:07:31 > > Hi Saurabh, > It seems that your SYSTEM.CATALOG got corrupted somehow. Usually you > need to disable and drop &

Re: Kerberos ticket renewal

2016-03-19 Thread Sergey Soldatov
Where do you see this error? Is it on the client side? Ideally you don't need to renew the ticket since the Phoenix driver gets the required information (principal name and keytab path) from the JDBC connection string and performs User.login itself. Thanks, Sergey On Wed, Mar 16, 2016 at 11:02 AM, Sanooj Padmak
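As an illustration (quorum, principal, and keytab path are placeholders), a connection string that carries the Kerberos credentials so the driver can log in by itself looks like this:

import java.sql.Connection;
import java.sql.DriverManager;

public class SecureConnectSketch {
    public static void main(String[] args) throws Exception {
        // format: jdbc:phoenix:<zk quorum>:<zk port>:<hbase znode>:<principal>:<keytab file>
        String url = "jdbc:phoenix:zk1,zk2,zk3:2181:/hbase:appuser@EXAMPLE.COM:/etc/security/keytabs/appuser.keytab";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("connected as appuser");
        }
    }
}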

Re: Extend CSVBulkLoadTool

2016-03-21 Thread Sergey Soldatov
Hi Anil, It will be really painful since the CSV bulk load uses the Apache Commons CSV parser for parsing input lines and it expects that the delimiter is a single character. I would suggest preparing the files before the bulk load, replacing the delimiter string with a single character using a perl/sed scri
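For example, assuming a hypothetical two-character delimiter "|~" that should become a plain pipe, something like this could be run over the input files first:
sed 's/|~/|/g' raw.csv > prepared.csv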

Re: How phoenix converts Integer to byte array under the hood

2016-03-22 Thread Sergey Soldatov
Hi Mohammad, The right class to look into is PInteger. It has a static class IntCodec which is used to encode/decode integers. Thanks, Sergey On Tue, Mar 22, 2016 at 7:15 AM, Mohammad Adnan Raza wrote: > I am changing my question a bit to be more precise... > Given a phoenix table with INTEGER col
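For illustration, a minimal sketch of the idea behind that codec (see PInteger.IntCodec for the real implementation): the value is written big-endian with the sign bit of the first byte flipped, so byte order matches numeric order:

public class IntCodecSketch {
    static byte[] encode(int v) {
        return new byte[] {
            (byte) ((v >> 24) ^ 0x80),  // flip the sign bit so negatives sort before positives
            (byte) (v >> 16),
            (byte) (v >> 8),
            (byte) v
        };
    }

    static int decode(byte[] b) {
        return ((b[0] ^ 0x80) & 0xff) << 24 | (b[1] & 0xff) << 16 | (b[2] & 0xff) << 8 | (b[3] & 0xff);
    }

    public static void main(String[] args) {
        System.out.println(decode(encode(-42)));  // prints -42
    }
}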

Re: Extend CSVBulkLoadTool

2016-03-22 Thread Sergey Soldatov
my file as > comma used as valid char for my data. > > From my understanding of the code, number of splits in csv record must match > with number of columns. Agree? > > Regards, > Anil > > > > On 21 March 2016 at 23:52, Sergey Soldatov wrote: >> >> Hi A

Re: How phoenix converts Integer to byte array under the hood

2016-03-23 Thread Sergey Soldatov
ay > start breaking. > > > > On Tue, Mar 22, 2016 at 11:11 PM, Sergey Soldatov > wrote: >> >> Hi Mohammad, >> The right class to look into is PInteger. It has static class IntCodec >> which is using for code/decode integers. >> >> Thanks, >

Re: How to use VARBINARY with CsvBulkLoadTool

2016-03-29 Thread Sergey Soldatov
Hi Jon, It is supposed to be Base64. Thanks, Sergey On Tue, Mar 29, 2016 at 12:38 PM, Cox, Jonathan A wrote: > I am wondering how I can use the CsvBulkLoadTool to insert binary data to a > table. For one thing, which format does CsvBulkLoadTool expect the data to > be encoded as within the CSV, whe
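For example, a sketch of producing the Base64 text that would go into the CSV column mapped to a VARBINARY field (the payload bytes are made up):

import java.util.Base64;

public class VarbinaryCsvSketch {
    public static void main(String[] args) {
        byte[] payload = {0x01, 0x02, (byte) 0xff};
        String csvValue = Base64.getEncoder().encodeToString(payload);
        System.out.println(csvValue);  // put this string into the CSV field
    }
}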

Re: [EXTERNAL] Re: How to use VARBINARY with CsvBulkLoadTool

2016-03-30 Thread Sergey Soldatov
mn that maps to a VARBINARY in the table, it will automatically interpret > it as base64 and decode/store it? I don't have to specify that my CSV file > contains base64? > > Thanks, > Jon > > -Original Message- > From: sergey.solda...@gmail.com [mailto:ser

Re: [EXTERNAL] RE: Problem Bulk Loading CSV with Empty Value at End of Row

2016-03-30 Thread Sergey Soldatov
Hi Jon, One of the parameters for CsvBulkLoadTool is -e, which specifies the escape character. You may specify a symbol which is not supposed to be in the data if you don't need support for escape sequences. Please note that escaped characters are not supported in the command line (they will as soo
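For illustration (table name, input path, and the chosen escape character are placeholders), the option is passed on the command line like this:
hadoop jar phoenix-<version>-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table MYTABLE --input /data/input.csv -e '^'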

Re: TEXT Data type in Phoenix?

2016-03-30 Thread Sergey Soldatov
Jon, It seems that documentation is a bit outdated. VARCHAR supports exactly what you want: create table x (id bigint primary key, x varchar); upsert into x values (1, ". (a lot of text there) " ); 0: jdbc:phoenix:localhost> select length(x) from x; ++ | LENGTH(X) | ++

Re: [EXTERNAL] Re: TEXT Data type in Phoenix?

2016-03-30 Thread Sergey Soldatov
> use a pre-defined length for VARCHAR? Or is it really all the same under the > hood? > > -Jonathan > > -Original Message- > From: sergey.solda...@gmail.com [mailto:sergey.solda...@gmail.com] On Behalf > Of Sergey Soldatov > Sent: Wednesday, March 30, 2016 5

Re: duplicated collumn name

2016-05-24 Thread Sergey Soldatov
Hi, Since the table was not properly created, the only reasonable solution is to delete it: delete from SYSTEM.CATALOG where TABLE_NAME='TABLE_20160511'; And in hbase shell disable 'TABLE_20160511' drop 'TABLE_20160511' Thanks, Sergey On Tue, May 24, 2016 at 2:04 AM, Tim Polach wrote: > Hi eve

Re: Storage benefits of ARRAY types

2016-05-25 Thread Sergey Soldatov
Hi Sumanta, It's obvious. If it's a fixed-length type, the serialized values are stored one by one. If the data type has variable length, then a special separator is inserted between values. Thanks, Sergey On Wed, May 25, 2016 at 4:07 AM, Sumanta Gh wrote: > Hi, > I found that when a VARCHAR ARRAY is stored i

Re: NoClassDefFoundError org/apache/hadoop/hbase/HBaseConfiguration

2016-07-05 Thread Sergey Soldatov
Robert, you should use the phoenix-4*-spark.jar that is located in root phoenix directory. Thanks, Sergey On Tue, Jul 5, 2016 at 8:06 AM, Josh Elser wrote: > Looking into this on the HDP side. Please feel free to reach out via HDP > channels instead of Apache channels. > > Thanks for letting u

Re: Scanning big region parallely

2016-10-21 Thread Sergey Soldatov
Hi Sanooj, You may take a look at BaseResultIterators.getIterators() and BaseResultIterators.getParallelScans() Thanks, Sergey On Fri, Oct 21, 2016 at 6:02 AM, Sanooj Padmakumar wrote: > Hi all > > If anyone can provide some information as to which part of the phoenix > code we need to check to

Re: Creating Covering index on Phoenix

2016-10-21 Thread Sergey Soldatov
Hi Mich, It really depends on the query that you are going to use. If conditions will be applied only to the time column, you may create an index like create index I on "marketDataHbase" ("timecreated") include ("ticker", "price"); If the conditions will be applied to other columns as well, you may use

Re: Not working as expected

2016-10-25 Thread Sergey Soldatov
Hi Vivek, You may use the meta information for the connection. Once you have created a connection using the Phoenix JDBC driver, you may run the following code: ResultSet rs = connection.meta.getTables(connection.meta.getConnection().getCatalog(), null, "%", new String[] {"TABLE"}); Iterating over the result set y
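For reference, a standalone version of the same idea using standard JDBC metadata (the ZooKeeper host is a placeholder):

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ListPhoenixTables {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")) {
            DatabaseMetaData md = conn.getMetaData();
            try (ResultSet rs = md.getTables(conn.getCatalog(), null, "%", new String[] {"TABLE"})) {
                while (rs.next()) {
                    // standard JDBC metadata columns
                    System.out.println(rs.getString("TABLE_SCHEM") + "." + rs.getString("TABLE_NAME"));
                }
            }
        }
    }
}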

Re: Phoenix + Spark

2016-10-26 Thread Sergey Soldatov
(1) You need only the client jar (phoenix-<version>-client.jar) (2) set spark.executor.extraClassPath in the spark-defaults.conf to the client jar Hope that helps. Thanks, Sergey On Tue, Oct 25, 2016 at 9:31 PM, min zou wrote: > Dear, i use spark to do data analysis,then save the result to Phonix.
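For example, a line like this in spark-defaults.conf (the jar path is a placeholder for wherever the client jar lives on the executor nodes):
spark.executor.extraClassPath /opt/phoenix/phoenix-<version>-client.jar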

Re: Phoenix + Spark

2016-10-26 Thread Sergey Soldatov
e( > DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$ > deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) &

Re: Phoenix Slow Problem

2016-11-02 Thread Sergey Soldatov
Hi Fawaz, Actually the explain plan says that there will be 6 parallel full scans. I believe that's the number of regions you have. If you want to increase the number of parallel scans, you may think about setting phoenix.stats.guidepost.width to something smaller than the default value and scans will be ex

Re: Statistics collection in Phoenix

2016-11-02 Thread Sergey Soldatov
Hi Mich, The statistics are stored in the SYSTEM.STATS table. And yes, there are guideposts per column family. As for (3) and (4) I think the answer is no. Guideposts are more like a point for a specific row key (so if we scan for a specific row key we can quickly find where to start scanning) and let us r

Re: Read only user permissions to Phoenix table - Phoenix 4.5

2017-02-16 Thread Sergey Soldatov
Unfortunately some versions of the Phoenix client use HBase APIs (such as getHTableDescriptor) that require HBase CREATE/ADMIN permissions on system tables. Moreover, the upgrade path tries to create the system tables to check whether the system requires an upgrade, and that may fail with a permission exc

Re: How to migrate sql cascade and foreign keys

2017-03-08 Thread Sergey Soldatov
Well, Apache Phoenix doesn't support foreign keys, so you need to manage this functionality in your application layer. Sometimes, depending on the scenario, you may emulate this functionality using VIEWs over the user table with additional columns instead of creating a set of separate tables. More inform

Re: phoenix client config and memory

2017-03-08 Thread Sergey Soldatov
Hi, You may specify the HBase config by setting the HBASE_CONF_DIR env variable, or just put the directory that contains it in the classpath. If no hbase-site.xml is found, the default values will be used. As for the memory, it really depends on the usage scenario. If you have large tables with thousands of region

Re: write dataframe to phoenix

2017-03-27 Thread Sergey Soldatov
Hi Sateesh, You need only the phoenix-<version>-client.jar. I noticed a problem in your sample: zkUrl is the ZooKeeper URL, not a JDBC connection string, so remove 'jdbc:phoenix:'. Also check that you provide the correct zk parent in the connection string (if it is different from /hbase). I would also recommend to add i
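As an illustration (table name, ZooKeeper host and parent znode are placeholders), writing a DataFrame through the phoenix-spark connector looks roughly like this; note that zkUrl has no jdbc:phoenix: prefix:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

public class PhoenixWriteSketch {
    static void writeToPhoenix(Dataset<Row> df) {
        df.write()
          .format("org.apache.phoenix.spark")
          .option("table", "MY_TABLE")
          .option("zkUrl", "zkhost:2181:/hbase")   // ZooKeeper quorum (+ parent znode), not a JDBC URL
          .mode(SaveMode.Overwrite)
          .save();
    }
}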

Re: How can I "use" a hbase co-processor from a User Defined Function?

2017-04-17 Thread Sergey Soldatov
No. A UDF doesn't have any context about where it's executed, so it can obtain neither a region instance nor a coprocessor instance. Thanks, Sergey On Fri, Apr 14, 2017 at 10:39 AM, Cheyenne Forbes < cheyenne.osanu.for...@gmail.com> wrote: > would *my_udf* be executed on the region server that the

Re: Problem connecting JDBC client to a secure cluster

2017-04-17 Thread Sergey Soldatov
That's not hbase-site.xml being loaded incorrectly. This is the behavior of the Java classpath: it accepts only jars and directories. So if any resources other than jars should be added to the classpath, you need to add to the classpath the directory where they are located. Thanks, Sergey On Tue, Apr 11,

Re: How can I "use" a hbase co-processor from a User Defined Function?

2017-04-18 Thread Sergey Soldatov
Well, theoretically there is a way of having a coprocessor that keeps a public static map of the current rowkey processed by Phoenix and the correlated HRegion instance, and getting this HRegion using the key that is processed by the evaluate function. But it's a completely wrong approach for both HBase and P

Re: How can I "use" a hbase co-processor from a User Defined Function?

2017-04-18 Thread Sergey Soldatov
ot the client*.* > > Regards, > > Cheyenne O. Forbes > > > On Tue, Apr 18, 2017 at 1:22 PM, James Taylor > wrote: > >> Shorter answer is "no". Your UDF may be executed on the client side as >> well (depending on the query) and there is of course no HRegio

Re: Are arrays stored and retrieved in the order they are added to phoenix?

2017-04-18 Thread Sergey Soldatov
Of course they are stored in the same order, but using special encoding. It's explained in PArrayDataType: /** * The datatype for PColummns that are Arrays. Any variable length array would follow the below order. Every element * would be seperated by a seperator byte '0'. Null elements are count

Re: How can I "use" a hbase co-processor from a User Defined Function?

2017-04-18 Thread Sergey Soldatov
t; Cheyenne O. Forbes > > > On Tue, Apr 18, 2017 at 4:36 PM, Sergey Soldatov > wrote: > >> I may be wrong, but you have chosen wrong approach. Such kind of >> integration need to be (should be) done on the Phoenix layer in the way >> like global/local indexes are implement

Re: How can I "use" a hbase co-processor from a User Defined Function?

2017-04-19 Thread Sergey Soldatov
d passed to the UDF is located and the >>> value returned my* "getFilesystem()" *of* "**HRegion", *what do you >>> recommend that I do? >>> >>> Regards, >>> >>> Cheyenne O. Forbes >>> >>> >>> >&

Re: ORDER BY not working with UNION ALL

2017-05-10 Thread Sergey Soldatov
Well, even if you don't use the family, you will get an error that column date_time is undefined. Consider the result of UNION ALL as a separate table; ORDER BY is applied to that table, and you don't have a column date_time there. Don't forget that UNION ALL may work with different tables, so there may b

Re: Row timestamp usage

2017-05-22 Thread Sergey Soldatov
AFAIK, depending on the version of Phoenix you are using, you may experience problems with MR bulk load or indexes. Possibly some other 'side effects' - try to search JIRAs for 'ROW TIMESTAMP'. There is no way to alter the column type except to drop/create this column. Thanks, Sergey On Mon, May 22

Re: Async Index Creation fails due to permission issue

2017-05-26 Thread Sergey Soldatov
Try to create a directory which will be accessible to everyone (777) and point the output directory there (like --output-path /temp/MYTABLE_GLOBAL_INDEX_HFILE). Could you also provide a bit more information on whether you are using kerberos and the versions of hdfs/hbase/phoenix. Thanks, Sergey On Tue, May

Re: Class org.apache.phoenix.mapreduce.bulkload.TableRowkeyPair not found

2017-06-01 Thread Sergey Soldatov
You may try to remove mapredcp and keep /etc/hbase/conf in the HADOOP_CLASSPATH. Thanks, Sergey On Thu, Jun 1, 2017 at 12:59 AM, cmbendre wrote: > Trying to bulk load CSV file on Phoenix 4.9.0 on EMR. > > Following is the command - > > /export HADOOP_CLASSPATH=$(hbase mapredcp):/usr/lib/hbase/

Re: Delete from Array

2017-06-06 Thread Sergey Soldatov
From the Apache Phoenix documentation: - Partial update of an array is currently not possible. Instead, the array may be manipulated on the client-side and then upserted back in its entirety. Thanks, Sergey On Mon, Jun 5, 2017 at 7:25 PM, Cheyenne Forbes < cheyenne.osanu.for...@gmail.
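For example, a sketch of writing the whole array back over JDBC after modifying it on the client (the table, column names, and values are made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ArrayRewriteSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")) {
            // after removing/adding elements on the client, upsert the full array back
            Long[] newContents = {1L, 3L, 4L};
            try (PreparedStatement ps =
                     conn.prepareStatement("UPSERT INTO MY_TABLE (ID, EVENTS) VALUES (?, ?)")) {
                ps.setLong(1, 1L);
                ps.setArray(2, conn.createArrayOf("BIGINT", newContents));
                ps.executeUpdate();
            }
            conn.commit();
        }
    }
}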

Re: Large CSV bulk load stuck

2017-06-06 Thread Sergey Soldatov
Which version of Phoenix are you using? There were several bugs related to local indexes and CSV bulkload in 4.7 and 4.8, I believe. Another problem I remember is the RAM size for reducers. It may sound ridiculous, but using less may help. Thanks, Sergey On Fri, Jun 2, 2017 at 11:13 AM, cmbendre wr

Re: what kind of type in phoenix is suitable for mysql type text?

2017-06-14 Thread Sergey Soldatov
Text in mysql is stored outside of the table. HBase has MOB (medium-sized objects) for that, but Phoenix doesn't support it at the moment. So, the only option is to use varchar. HBase by default allows you to have something like 10mb in a single KV pair, but you may change it using hbase.client.keyvalue.ma

Re: Cant run map-reduce index builder because my view/idx is lower case

2017-06-22 Thread Sergey Soldatov
You may try to build Phoenix with the patch from PHOENIX-3710 applied. That should fix the problem, I believe. Thanks, Sergey On Mon, Jun 19, 2017 at 11:28 AM, Batyrshin Alexander <0x62...@gmail.com> wrote: > Hello again, > > Could you, please, hel

Re: Getting too many open files during table scan

2017-06-23 Thread Sergey Soldatov
You may check "Are there any tips for optimizing Phoenix?" section of Apache Phoenix FAQ at https://phoenix.apache.org/faq.html. It says how to pre-split table. In your case you may split on the first letters of client_id. When we are talking about monotonous data, we usually mean the primary key

Re: ArrayIndexOutOfBounds excpetion

2017-07-19 Thread Sergey Soldatov
Hi Siddharth, The problem that was described in PHOENIX-3196 (as well as in PHOENIX-930 and several others) is that we sent metadata updates for the table before checking whether there is a problem with duplicated column names. I don't think you hit the same problem if you are using ALTER TABLE. Co

Re: Difference in response time for Join queries with a hint.(ResultSet.next() takes a lot of time )

2017-07-20 Thread Sergey Soldatov
Hi Siddharth, That sounds strange because the sqlline tool is just another DB client and it uses the same JDBC API. By any chance can you provide the DDLs and queries, so we will be able to reproduce the problem? Thanks, Sergey On Wed, Jul 19, 2017 at 11:16 PM, Siddharth Ubale < siddharth.ub...

Re: hash cache errors

2017-07-21 Thread Sergey Soldatov
Hi Mike, There are a couple of reasons why it may happen: 1. The server-side cache expired. The time to live can be changed by phoenix.coprocessor.maxServerCacheTimeToLiveMs 2. The region has been moved to another region server where the join cache is missing. Look at https://issues.apache.org/jira/browse/PHOENI

Re: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: RELATIONSHIPDATA,,1501224108837.1fc1615f5be984e13329b31a902ebf44

2017-07-28 Thread Sergey Soldatov
Hi Siddharth, based on the fact that it works fine with HBase 0.98 I can suggest that somehow you ran into the partial row problem. To confirm that, can you rebuild the client with the following changes in phoenix-core/src/main/java/org/apache/phoenix/coprocessor/BaseScannerRegionObserver.java -

Re: Potential causes for very slow DELETEs?

2017-08-18 Thread Sergey Soldatov
Hi Pedro, Usually that kind of behavior should be reflected in the region server logs. Try to turn on DEBUG level and check what exactly the RS is doing during that time. Also you may check the thread dump of the RS during the execution and see what the rpc handlers are doing. One thing that should be checke

Re: Phoenix csv bulkload tool not using data in rowtimestamp field for hbase timestamp

2017-08-27 Thread Sergey Soldatov
Hi Rahul, It seems that you ran into https://issues.apache.org/jira/browse/PHOENIX-3406. You may apply the patch and rebuild Phoenix. Sorry, I actually thought that it was already integrated. Will revisit it again. Thanks, Sergey. On Sun, Aug 20, 2017 at 3:57 AM, rahuledavalath1 wrote: > Hi All,

Re: Phoenix CSV Bulk Load fails to load a large file

2017-09-06 Thread Sergey Soldatov
Do you have more details on the version of Phoenix/HBase you are using as well as how it hangs (Exceptions/messages that may help to understand the problem)? Thanks, Sergey On Wed, Sep 6, 2017 at 1:13 PM, Sriram Nookala wrote: > I'm trying to load a 3.5G file with 60 million rows using CsvBulkL

Re: Race condition around first/last value or group by?

2017-09-19 Thread Sergey Soldatov
Sounds like a bug. If you have a reproducible case (DDLs + sample data + query), please file a JIRA. Actually the number of cores should not affect the query execution. The behavior you described means that KV pair for the projection was incorrectly built and that may happen on the server side. Th

Re: SQLline and binary columns

2017-09-27 Thread Sergey Soldatov
Please check https://phoenix.apache.org/language/functions.html for functions that work with binary data. Like GET_BIT, GET_BYTE. Thanks, Sergey On Wed, Sep 27, 2017 at 2:46 PM, Jon Strayer wrote: > Is there a way to query a table based on a binary column (16 bytes)? > > > > — > > *Jon Strayer

Re: Performance of Inserting HBASE Phoenix table via Hive

2017-10-09 Thread Sergey Soldatov
You need to remember that inserting into Phoenix from Hive goes through an additional layer (StorageHandler) which is not optimized like ORC or other Hive-specific formats. So you may expect that it will be visibly slower than a regular Hive table and very slow compared to the regular Phoenix u

Re: Phoenix 4.12 error on HDP 2.6

2017-10-25 Thread Sergey Soldatov
It would not work without patching and rebuilding the sources using the vendor's artifacts. As Ted already mentioned, you may consult the vendor. Thanks, Sergey On Wed, Oct 25, 2017 at 3:59 AM, Sumanta Gh wrote: > Hi, > I am trying to install phoenix-4.12.0 (HBase-1.1) on HDP 2.6.2.0. As per > installati

Re: Cloudera parcel update

2017-10-25 Thread Sergey Soldatov
Hi Flavio, It looks like you need to ask the vendor, not the community about their plan for further releases. Thanks, Sergey On Wed, Oct 25, 2017 at 2:21 PM, Flavio Pompermaier wrote: > Hi to all, > the latest Phoenix Cloudera parcel I can see is 4.7...any plan to release > a newer version? >

Re: Querying table with index fails in some configurations.

2017-10-30 Thread Sergey Soldatov
It's reproducible on my box which is maybe several days behind the master branch, so feel free to file a JIRA. Thanks, Sergey

Re: Particular column shows NA for certain queries, is populated for others

2017-10-30 Thread Sergey Soldatov
Is it reproducible only on large data sets? If so, then you possibly hit the PHOENIX-3112 bug. As a workaround, you may try to increase the hbase.client.scanner.max.result.size property to something big (100Mb for example). Thanks, Sergey On Mon, Oct

Re: SELECT + ORDER BY vs self-join

2017-10-31 Thread Sergey Soldatov
I agree with James that this happens because the index was not used, since it doesn't cover all columns. I believe that in the second case, the RHT is using the index to create a list of rowkeys and they are used for point lookups by the skip scan. bq. When is using the self-join a worse choice t

Re: how to update cache_size in SYSTEM."SEQUENCE"

2017-11-06 Thread Sergey Soldatov
Well, the regular way to do that is: upsert into TABLE (pk1, pk2, pk3, column_to_change) select pk1, pk2, pk3, 'new_value' from TABLE where ... So you have to provide ALL PK columns. Thanks, Sergey On Mon, Nov 6, 2017 at 8:04 AM, Adi Kadimetla wrote: > Hi Team, > > I need help to upd

Re: Tuning MutationState size

2017-11-09 Thread Sergey Soldatov
Could you provide the version you are using? Do you have autocommit turned on and have you changed the following properties: phoenix.mutate.batchSize phoenix.mutate.maxSize phoenix.mutate.maxSizeBytes Thanks, Sergey If you are using more recent version, than you may consider to On Thu, Nov 9, 201

Re: Transfer all data to a new phoenix cluster

2017-11-14 Thread Sergey Soldatov
You may use the standard procedures to copy HBase tables across clusters (copyTable, snapshot-based copy; there are a lot of articles on this topic). And the execution time depends on the network speed. For 1Gbit it would take something close to 5-6 hours. If you don't want any downtime and plans c

Re: Pool size / queue size with thin client

2017-11-14 Thread Sergey Soldatov
Make sure that you have restarted PQS as well and it has the updated hbase-site.xml in the classpath. Thanks, Sergey On Tue, Nov 14, 2017 at 6:53 AM, Stepan Migunov < stepan.migu...@firstlinesoftware.com> wrote: > Hi, > > Could you please suggest how I can change pool size / queue size when > us

Re: Bulk loading into table vs view

2017-11-28 Thread Sergey Soldatov
Please take a look at https://phoenix.apache.org/views.html All views are 'virtual' tables, so they don't have a dedicated physical table and operate on top of the table that is specified in the view DDL. Thanks, Sergey On Sat, Nov 25, 2017 at 6:25 AM, Eisenhut, Roman wrote: > Dear Phoenix-Tea

Re: Reading Phoenix Upserted data directly from Hbase

2017-12-01 Thread Sergey Soldatov
HBase doesn't know about the data types that you are using in Phoenix, so it operates with binary arrays. The HBase shell shows printable ASCII characters as-is and hex values for the rest. You may use the phoenix-spark module to work with Phoenix from Spark. Thanks, Sergey On Thu, Nov 30, 2017 at 11:22 PM,

Re: Why the bulkload does not support update index data?

2018-01-22 Thread Sergey Soldatov
What do you mean by 'bulkload can not update index data'? During the bulkload, the MR job creates HFiles for the table and all corresponding indexes and uses the regular HBase bulkload to load them. Have you had a problem during the HBase bulkload of the generated index HFiles? Thanks, Sergey On Fri, Jan 19, 2

Re: High CPU usage on Hbase region Server with GlobalMemoryManager warnings

2018-02-01 Thread Sergey Soldatov
That kind of message may happen when there were queries that utilize the memory manager (usually joins and group by) and they timed out or failed for some reason. So the message itself is hardly related to CPU usage or GC. BUT. That may mean that your region servers are unable to handle proper

Re: Changing number of salt buckets for a table

2018-02-15 Thread Sergey Soldatov
Well, there is no easy way to resalt the table. The main problem is that when the salting byte is calculated, the number of buckets is used. So if we want to change the number of buckets, all rowkeys have to be rewritten. I think that you still can use an MR job for that, but I would recommend writing the data t

Re: Incorrect number of rows affected from DELETE query

2018-02-22 Thread Sergey Soldatov
Hi Jins, If you provide steps to reproduce it would be much easier to understand where the problem is. If nothing was deleted the report should be 'No rows affected'. Thanks, Sergey On Mon, Feb 19, 2018 at 4:30 PM, Jins George wrote: > Hi, > > I am facing an issue in which the number of rows af

Re: A strange issue about missing data

2018-03-29 Thread Sergey Soldatov
Usually such kinds of problems may happen when something is wrong with the statistics. You may try to clean SYSTEM.STATS, restart the client and check whether it fixes the problem. If not, you may try to turn on DEBUG log level on the client and check whether the generated scans are covering all regions (th

Re: Timestamp retreiving in Phoenix

2018-03-29 Thread Sergey Soldatov
The general answer is no. In some cases, row timestamp feature https://phoenix.apache.org/rowtimestamp.html may be useful. But still, you should have timestamp column in your table DDL in that case. Thanks, Sergey On Thu, Mar 29, 2018 at 1:14 AM, alexander.scherba...@yandex.com < alexander.scherb

Re: SALT_BUCKETS and Writing

2018-04-04 Thread Sergey Soldatov
The salt byte is calculated on the Phoenix client side, based on the hash of the row key and the number of buckets. So it's preferable to use Phoenix to write. If you have to use the HBase API for any reason, you need to perform the same calculations (copy/paste the code from Phoenix). Thanks, Sergey On Wed, Ap
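For illustration only (the hash below is a placeholder; copy the real one from Phoenix's SaltingUtil if you write through the HBase API), the structure is: the first rowkey byte is a hash of the rest of the key modulo the bucket count.

public class SaltSketch {
    // placeholder hash function; Phoenix uses its own, so copy it rather than this one
    static byte saltByte(byte[] key, int buckets) {
        int hash = 0;
        for (byte b : key) {
            hash = 31 * hash + b;
        }
        return (byte) ((hash & Integer.MAX_VALUE) % buckets);
    }

    static byte[] saltedKey(byte[] key, int buckets) {
        byte[] salted = new byte[key.length + 1];
        salted[0] = saltByte(key, buckets);
        System.arraycopy(key, 0, salted, 1, key.length);
        return salted;
    }
}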

Re: intermittent problem to query simple table

2018-04-06 Thread Sergey Soldatov
There is a JIRA on this topic: PHOENIX-4366. Thanks, Sergey On Thu, Apr 5, 2018 at 11:42 AM, Xu, Nan wrote: > Hi, > > > > Env: hbase-1.1.4 > > Phoenix: 4.10 > >I am querying a very simple table in phoenix >

Re: hbase cell storage different bewteen bulk load and direct api

2018-04-18 Thread Sergey Soldatov
Hi Lew, No, the 1st one looks incorrect. You may file a bug on that (I believe that the second case is correct, but you may also check by uploading data using regular upserts). Also, you may check whether the master branch has this issue. Thanks, Sergey On Thu, Apr 19, 2018 at 10:19 AM, Lew J

Re: phoenix query server java.lang.ClassCastException for BIGINT ARRAY column

2018-04-18 Thread Sergey Soldatov
Could you please be more specific? Which version of phoenix are you using? Do you have a small script to reproduce? At first glance it looks like a PQS bug. Thanks, Sergey On Thu, Apr 19, 2018 at 8:17 AM, Lu Wei wrote: > Hi there, > > I have a phoenix table containing an BIGINT ARRAY column. Bu

Re: hbase cell storage different bewteen bulk load and direct api

2018-04-19 Thread Sergey Soldatov
rs the same as the psql results - i.e. extra > cells. I will try the master branch next. Thanks for the tip. > > -- Original Message -- > From: Sergey Soldatov > To: user@phoenix.apache.org > Subject: Re: hbase cell storage different bewteen bulk load and d

Re: hint to use a global index is not working - need to find out why

2018-04-19 Thread Sergey Soldatov
That looks strange. Could you please provide the full DDLs for the table and indexes? I just tried a similar scenario and obviously the index is used: 0: jdbc:phoenix:> create table VARIANTJOIN_RTSALTED24 (id integer primary key, chrom_int integer, genomic_range integer); No rows affected (6.339 seconds) 0: j

Re: 答复: phoenix query server java.lang.ClassCastException for BIGINT ARRAY column

2018-04-19 Thread Sergey Soldatov
; create table if not exists testarray(id bigint not null, events bigint > array constraint pk primary key (id)) > > > -- upsert data: > > upsert into testarray values (1, array[1,2]); > > > -- query: > > select id from testarray; -- fine > > select * from te

Re: Phoenix Client threads

2018-05-22 Thread Sergey Soldatov
The salting byte is calculated using a hash function over the whole row key (using all PK columns). So if you are using only one of the PK columns in the WHERE clause, Phoenix is unable to identify which salting byte (bucket number) should be used, so it runs scans for all salt buckets. All those threads

Re: Atomic UPSERT on indexed tables

2018-06-04 Thread Sergey Soldatov
Yes, the documentation doesn't reflect the recent changes. Please see https://issues.apache.org/jira/browse/PHOENIX-3925 Thanks, Sergey On Fri, Jun 1, 2018 at 5:39 PM, Miles Spielberg wrote: > From https://phoenix.apache.org/atomic_upsert.html: > > Although global indexes on columns being atomi

Re: Phoenix CsvBulkLoadTool fails with java.sql.SQLException: ERROR 103 (08004): Unable to establish connection

2018-08-20 Thread Sergey Soldatov
If I read it correctly, you are trying to use Phoenix and HBase that were built against Hadoop 2 with Hadoop 3. Was HBase the only component you upgraded? Thanks, Sergey On Mon, Aug 20, 2018 at 1:42 PM Mich Talebzadeh wrote: > Here you go > > 2018-08-20 18:29:47,248 INFO [main] zookeepe

Re: TTL on a single column family in table

2018-09-04 Thread Sergey Soldatov
What is the use case for setting TTL only for a single column family? I would say that making TTL table-wide is mostly a technical decision, because in relational databases we operate with rows, and supporting TTL for only some columns sounds a bit strange. Thanks, Sergey On Fri, Aug 31, 2018 at 7:43 AM

Re: SKIP_SCAN on variable length keys

2018-09-04 Thread Sergey Soldatov
SKIP SCAN doesn't use FuzzyRowFilter. It has its own SkipScanFilter. If you see problems, please provide more details or file a JIRA for that. Thanks, Sergey On Wed, Aug 29, 2018 at 2:17 PM Batyrshin Alexander <0x62...@gmail.com> wrote: > Hello, > Im wondering is there any issue with SKIP SCAN

Re: Salting based on partial rowkeys

2018-09-14 Thread Sergey Soldatov
Thomas is absolutely right that there will be a possibility of hotspotting. Salting is the mechanism that should prevent that in all cases (because all rowids are different). The partitioning described above actually can be implemented by using id2 as a first column of the PK and using presplit by

Re: ABORTING region server and following HBase cluster "crash"

2018-09-14 Thread Sergey Soldatov
That was a real problem quite a long time ago (a couple of years?). I can't say for sure in which version that was fixed, but now indexes have priority over regular tables and their regions open first. So by the time we replay WALs for tables, all index regions are supposed to be online. If you

Re: ABORTING region server and following HBase cluster "crash"

2018-09-14 Thread Sergey Soldatov
4, 2018 at 4:04 PM Sergey Soldatov wrote: > That was the real problem quite a long time ago (couple years?). Can't say > for sure in which version that was fixed, but now indexes has a priority > over regular tables and their regions open first. So by the moment when we > replay

Re: Issue with Restoration on Phoenix version 4.12

2018-09-14 Thread Sergey Soldatov
If you exported tables from 4.8 and are importing them into the preexisting tables in 4.12, make sure that you created the tables using COLUMN_ENCODED_BYTES = 0 or have phoenix.default.column.encoded.bytes.attrib set to 0 in hbase-site.xml. I believe that the problem you see is the column name encoding that

Re: ABORTING region server and following HBase cluster "crash"

2018-09-15 Thread Sergey Soldatov
not configured this: > > hbase.region.server.rpc.scheduler.factory.class > = org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory > > Can this misconfiguration leads to our problems? > > On 15 Sep 2018, at 02:04, Sergey Soldatov > wrote: > > That was the real problem quite a long time a

Re: IllegalStateException: Phoenix driver closed because server is shutting down

2018-09-19 Thread Sergey Soldatov
That might be a misleading message. Actually, it means that a JVM shutdown has been triggered (so the runtime has executed the shutdown hook for the driver, and that's the only place where we set this message) and after that, another thread was trying to create a new connection. Thanks, Sergey On Wed,