I had a very large Hive table that I needed in HBase.
After asking around, I came to the conclusion that my best bet was to:
1 - export the hive table to a CSV 'file'/folder on the HDFS
2 - Use the org.apache.phoenix.mapreduce.CsvBulkLoadTool to import the data.
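The two steps above can be sketched roughly as follows; the table name, HDFS path, and Phoenix client jar version are placeholders and will differ per cluster:

```shell
# Step 1: export the Hive table as comma-delimited text to a folder on HDFS.
# (HiveQL run via the hive CLI; my_table and the path are hypothetical.)
hive -e "INSERT OVERWRITE DIRECTORY '/tmp/my_table_csv'
         ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
         SELECT * FROM my_table;"

# Step 2: bulk-load the exported files into the Phoenix table with
# CsvBulkLoadTool (--table is the target Phoenix table, --input the
# HDFS directory written in step 1).
hadoop jar phoenix-<version>-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table MY_TABLE \
    --input /tmp/my_table_csv
```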
I found that if I tried to pass t
On Mon, Jun 22, 2015 at 11:34 AM, Riesland, Zack wrote:
> I had a very large Hive table that I needed in HBase.
>
> After asking around, I came to the conclusion that my best bet was to:
>
> 1 – export the hive table to a CSV 'file'/folder on the HDFS
Hive can connect to HBase and insert directly, in either direction.
I don't know if it also works via Phoenix...
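As a sketch of that direct Hive-to-HBase path, via the HBase storage handler (assuming the handler is on Hive's classpath; all table and column names below are hypothetical):

```shell
# Create a Hive table backed by an HBase table via HBaseStorageHandler,
# then insert into it straight from an existing Hive table. The first
# mapped column (:key) becomes the HBase row key.
hive -e "
CREATE TABLE hbase_target (rowkey STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:val')
TBLPROPERTIES ('hbase.table.name' = 'HBASE_TARGET');

INSERT OVERWRITE TABLE hbase_target
SELECT key_col, val_col FROM my_hive_table;
"
```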
Counting is too slow as a single-threaded job from the command line - you
should write a map-reduce job, with a filter that loads just the keys; that
is really fast.
A map-reduce job is also the
For #2: you can use the RowCounter map-reduce job that ships with HBase to
count the rows of a large table. You don't need to write any code.
Here is a sample command to invoke it:
hbase org.apache.hadoop.hbase.mapreduce.RowCounter
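RowCounter takes the table name as its argument and reports the result in the job counters; the table name below is a placeholder:

```shell
# Runs a map-reduce job over MY_TABLE; the row count appears in the
# ROWS counter of the completed job's output.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter MY_TABLE
```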
~Anil
On Mon, Jun 22, 2015 at 12:08 PM, Ciureanu Constantin <ciureanu.constan...@gmail.c
Hi Zack,
Would it be possible to provide a few more details on the kinds of
failures you're getting, both with the CsvBulkLoadTool and with the
"SELECT COUNT(*)" query?
About question #1, there aren't any known bugs (that I'm aware of) that
would cause some records to go missing in the CsvBulkLoadTool
I'd love to be able to ingest the entire data set overnight.
It's clear that I'm missing quite a bit of data and I'm going to have to start
over with this table...
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Tuesday, June 23, 2015 2:57 AM
To: user@phoenix.apache.org
Subject: Re: How
>
> I understand your comments about determining whether there are any failed
> map or reduce operations. I watched each one in the application master GUI
> and didn't notice any that failed.
>
> Finally, I understand your point about how the
>> at sqlline.SqlLine.dispatch(SqlLine.java:821)
>> at sqlline.SqlLine.begin(SqlLine.java:699)
>> at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>> at sqlline.SqlLine.main(SqlLine.java:424)
>> I am running the q
t or hints you might have.