Thanks Thomas,

My table has about 10 billion rows with about 12 columns.

-----Original Message-----
From: Thomas D'Silva [mailto:[email protected]] 
Sent: Wednesday, July 22, 2015 12:51 PM
To: [email protected]
Subject: Re: How fast is upsert select?

Zack,

It depends on how wide the rows in your table are. On an 8-node
cluster, creating an index with 3 columns (char(15), varchar, and
date) on a 1-billion-row table takes about 1 hour 15 minutes.
How many rows does your table have, and how wide are they?
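
As a rough back-of-envelope sketch only: the data point above (about 75 minutes for 1 billion rows on 8 nodes) can be scaled to other table sizes if one assumes the time grows linearly with row count and shrinks linearly with node count. Both are assumptions, not measured behavior, and the figure is for an index build on 3 columns, not for an upsert select over full rows. The function name and its parameters are hypothetical, chosen for illustration.

```python
def estimate_minutes(rows, nodes=8,
                     ref_rows=1_000_000_000, ref_minutes=75.0, ref_nodes=8):
    """Linearly extrapolate a run time from one reference measurement.

    Assumes (without evidence) that time is proportional to the row
    count and inversely proportional to the number of nodes.
    """
    return ref_minutes * (rows / ref_rows) * (ref_nodes / nodes)


if __name__ == "__main__":
    # 10 billion rows, first on the 8-node reference cluster, then on 6 nodes.
    for nodes in (8, 6):
        hours = estimate_minutes(10_000_000_000, nodes=nodes) / 60
        print(f"{nodes} nodes: about {hours:.1f} hours")
```

Under these assumptions a table of 10 billion rows lands somewhere in the overnight range (about 12.5 hours at 8 nodes), but row width, region splits, and cluster hardware could move the real number considerably in either direction.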

On Wed, Jul 22, 2015 at 8:29 AM, Riesland, Zack <[email protected]> 
wrote:
> Thanks Ravi,
>
>
>
> I think I may not have IndexTool in my version of Phoenix.
>
>
>
> I’m calling:
> HADOOP_CLASSPATH=/usr/hdp/current/hbase-master/conf/:/usr/hdp/current/hbase-master/lib/hbase-protocol.jar
> hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar
> org.apache.phoenix.mapreduce.index.IndexTool
>
>
>
> And getting a java.lang.ClassNotFoundException:
> org.apache.phoenix.mapreduce.index.IndexTool
>
>
>
>
>
>
>
> From: Ravi Kiran [mailto:[email protected]]
> Sent: Wednesday, July 22, 2015 10:36 AM
> To: [email protected]
> Subject: Re: How fast is upsert select?
>
>
>
> Hi,
>
>
>
>    Since you are talking about billions of rows, why don't you try the
> MapReduce route to speed up the process? You can take a look at how
> IndexTool.java
> (https://github.com/apache/phoenix/blob/359c255ba6c67d01a810d203825264907f580735/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexTool.java)
> was written, as it does a similar task: it reads from a Phoenix table
> and writes the data into the target table using bulk load.
>
>
>
>
>
> Regards
>
> Ravi
>
>
>
> On Wed, Jul 22, 2015 at 6:23 AM, Riesland, Zack 
> <[email protected]>
> wrote:
>
> I want to play with some options for splitting a table to test performance.
>
>
>
> If I were to create a new table and perform an upsert select * to the 
> table, with billions of rows in the source table, is that like an 
> overnight operation or should it be pretty quick?
>
>
>
> For reference, we have 6 (beefy) region servers in our cluster.
>
>
>
> Thanks!
>
>
>
>
