Can't run map-reduce index builder because my view/idx is lower case

2017-06-16 Thread Batyrshin Alexander
 Hello,
I'm trying to build an ASYNC index following the example from
https://phoenix.apache.org/secondary_indexing.html

My issue is that my view name and index name are lowercase, so the map-reduce job raises this
error:

2017-06-17 03:45:56,506 ERROR [main] index.IndexTool: An exception occurred 
while performing the indexing job: IllegalArgumentException:  
INVOICES_V4_INDEXED_FUZZY_IDX is not an index table for 
INVOICES_V4_INDEXED_FUZZY
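
For reference, the flow documented on that page is to declare the index ASYNC and then
launch the MR job; a rough sketch (the schema/table/index names below are placeholders,
not the ones from the report above):

  CREATE INDEX ASYNC_IDX ON MY_SCHEMA.MY_TABLE (SOME_COL) ASYNC;

  ${HBASE_HOME}/bin/hbase org.apache.phoenix.mapreduce.index.IndexTool
    --schema MY_SCHEMA --data-table MY_TABLE --index-table ASYNC_IDX
    --output-path ASYNC_IDX_HFILES

The names in the error message are upper-cased versions of the lowercase view/index names,
which suggests the arguments are being normalized somewhere before the lookup; checking
whether the lowercase identifiers need to be double-quoted on the command line (as they
must be in SQL) would be the first thing to try.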



Re: Phoenix Upgrade compatibility

2017-06-16 Thread Michael Young
Updating my question here:

I found some info in a May 19th post under the thread "Upgrade Phoenix version on
HDP 2.4".

It seems that those of us using HDP 2.6 are stuck with Phoenix 4.7 until the
next Hortonworks release. :(

To any Hortonworks people out there: Is there any way we could do a full
install of Phoenix 4.10 (essentially from scratch, not a rolling upgrade or
just replacing the jars), and get it to work in HDP 2.6?  I would think so,
but we don't want to live too much on the edge, just enough to not wait too
long. :)

Michael


On Fri, Jun 16, 2017 at 5:01 PM, Michael Young  wrote:

> According to https://phoenix.apache.org/upgrading.html
>
> "Phoenix maintains backward compatibility across at least two minor
> releases to allow for *no downtime* through server-side rolling restarts
> upon upgrading. See below for details"
>
> We would like to upgrade from version 4.7 to a newer version.
>
> Any recommendations on which version we should get (latest?) and if we
> should do an incremental upgrade or move directly to the latest 4.10?
>
> Thanks,
> Michael
>
>


Phoenix Upgrade compatibility

2017-06-16 Thread Michael Young
According to https://phoenix.apache.org/upgrading.html

"Phoenix maintains backward compatibility across at least two minor
releases to allow for *no downtime* through server-side rolling restarts
upon upgrading. See below for details"

We would like to upgrade from version 4.7 to a newer version.

Any recommendations on which version we should get (latest?) and if we
should do an incremental upgrade or move directly to the latest 4.10?

Thanks,
Michael


Getting too many open files during table scan

2017-06-16 Thread Michael Young
We are running a 13-node HBase cluster.  One table uses 78 SALT_BUCKETS,
which seems to work reasonably well for both read and write.  This table
has 130 columns with a PK made up of 30 columns (a fairly wide table).

However, after adding several new tables we are seeing errors about too
many open files when running a full table scan.


Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open
files
at
org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
at
org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
at
org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
at
org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
at
org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
at
org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Too many open files
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2024)
at
org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
at
org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
at
org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
at
org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
at
org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
at
org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
at
org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
... 10 more
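
The trace shows the client-side SpoolingResultIterator trying to create a temp file for a
scan chunk; with thousands of chunks running in parallel (see the plan below) the client
process can exhaust its open-file limit. Not from the thread, but the usual knobs to look
at are the OS limit for the client process and the client-side Phoenix spool/thread-pool
settings; the values below are only illustrative:

  # on the client machine, check the limit the JVM runs under
  ulimit -n

  <!-- client-side hbase-site.xml -->
  <property>
    <!-- size of the client thread pool used for parallel scans (default 128) -->
    <name>phoenix.query.threadPoolSize</name>
    <value>64</value>
  </property>
  <property>
    <!-- bytes buffered in memory per iterator before spooling to a temp file (default ~20MB) -->
    <name>phoenix.query.spoolThresholdBytes</name>
    <value>104857600</value>
  </property>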


When running an explain plan:
explain select count(1) from MYBIGTABLE

+-----------------------------------------------------------------------------------------------------------------+
| PLAN                                                                                                              |
+-----------------------------------------------------------------------------------------------------------------+
| CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP  |
| ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                    |
| SERVER FILTER BY FIRST KEY ONLY                                                                                  |
| SERVER AGGREGATE INTO SINGLE ROW                                                                                 |
+-----------------------------------------------------------------------------------------------------------------+

It has a lot of chunks.  Normally this query would return at least some
result after running for a few minutes.  With appropriate filters in the
WHERE clause, the queries run fine.

Any suggestions on how to avoid this error and get better performance from
the table scans?  We realize we don't need to run full table scans
regularly; we're just trying to better understand best practices for
Phoenix on HBase.

Thank you,
Michael


Re: Best strategy for UPSERT SELECT in large table

2017-06-16 Thread James Taylor
Hi Pedro,
Before 4.10, it will run on a single client (though multi-threaded). With 4.10
and above, the statement runs distributed across your cluster, so
performance should improve. Note that if the source table is taking writes
while the UPSERT SELECT is running, the statement will miss those writes.

Another alternative would be to write your own MR job to do the population.

Thanks,
James

On Fri, Jun 16, 2017 at 7:51 AM Pedro Boado  wrote:

> Hi guys,
>
> We are trying to populate a Phoenix table based on a 1:1 projection of
> another table with around 15,000,000,000 records via an UPSERT SELECT in
> the Phoenix client. We've noticed very poor performance (I suspect the
> client is using a single-threaded approach) and lots of client
> timeouts.
>
> Is there a better way of approaching this problem?
>
> Cheers!
> Pedro
>
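
Not from the replies above, but one common way to keep a pre-4.10 client-side copy
manageable is to break it into ranges on the leading primary-key column and run the pieces
as separate statements, possibly from several clients in parallel (table, column and
boundary values here are hypothetical):

  UPSERT INTO TARGET_TABLE (PK, COL_A, COL_B)
  SELECT PK, COL_A, COL_B FROM SOURCE_TABLE WHERE PK >= 'A' AND PK < 'M';

  UPSERT INTO TARGET_TABLE (PK, COL_A, COL_B)
  SELECT PK, COL_A, COL_B FROM SOURCE_TABLE WHERE PK >= 'M';

Each range is a smaller unit of work, which also shrinks the window in which client
timeouts can bite.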


Best strategy for UPSERT SELECT in large table

2017-06-16 Thread Pedro Boado
Hi guys,

We are trying to populate a Phoenix table based on a 1:1 projection of
another table with around 15,000,000,000 records via an UPSERT SELECT in
the Phoenix client. We've noticed very poor performance (I suspect the
client is using a single-threaded approach) and lots of client
timeouts.

Is there a better way of approaching this problem?

Cheers!
Pedro


Re: how to modify column in phoenix?

2017-06-16 Thread rafa
Hi,

As far as I know there is no support for that in Phoenix.

James Taylor explained an alternative to accomplish that in this thread:

https://mail-archives.apache.org/mod_mbox/phoenix-user/201610.mbox/browser

Regards,
rafa


On Fri, Jun 16, 2017 at 10:33 AM, 曾柏棠  wrote:

> hi,
> I want to widen one column in my Phoenix table, but I cannot find
> anything in the Phoenix reference about how to modify a column,
> so can Phoenix do something like *alter table modify some_column
> varchar(100)?*
>
> *thanks!*
>
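
Since there is no ALTER ... MODIFY in Phoenix DDL, a generic workaround sketch (not
necessarily the approach described in the linked thread; all names here are made up) is to
create a new table with the wider column, copy the data across, and then drop the old one:

  CREATE TABLE MY_TABLE_NEW (
      ID          VARCHAR NOT NULL PRIMARY KEY,
      SOME_COLUMN VARCHAR(100)
  );
  UPSERT INTO MY_TABLE_NEW (ID, SOME_COLUMN) SELECT ID, SOME_COLUMN FROM MY_TABLE;
  DROP TABLE MY_TABLE;
  -- recreate any views or indexes against MY_TABLE_NEW

For a large table, the UPSERT SELECT considerations from the other thread in this digest
apply.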


how to modify column in phoenix?

2017-06-16 Thread 曾柏棠
hi,   I want to widen one column in my Phoenix table, but I cannot find
anything in the Phoenix reference about how to modify a column,
so can Phoenix do something like alter table modify some_column
varchar(100)?


thanks!