> -z,--zookeeper Zookeeper quorum to connect to (optional)
> -it,--index-table Index table name to load (optional)
>
>
>
>
> From: Gabriel Reid <g...@gmail.com>
> Subject: Re: Error with lines ended with backslash when Bulk Data Loading
> Date: 2016-12-09 02:06 (+0800)
>
Hi,
Backslash is the default escape character that is used for parsing CSV
data when running a bulk import, so it has a special meaning.
You can supply a different (custom) escape character with the -e or
--escape flag on the command line so that CSV files that include
backslashes can be parsed correctly.
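As a rough illustration, an invocation could look like the following
(the table name, input path, and the '~' escape character are
placeholders, not values from the original report):
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE --input /data/example.csv --escape '~'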
Hi Zack,
My initial gut feeling is that this doesn't have anything to do with the
commas in the input data, but it looks like instead the pipe separator
isn't being taken into account. Has this been working for you with other
data files?
I've got more questions than answers to start with:
*
Hi Zack,
Am I correct in understanding that the files are under a structure like
x/.deflate/csv_file.csv ?
In that case, I believe everything under the .deflate directory will
simply be ignored, as directories whose name start with a period are
considered "hidden" files.
However, assuming the
>
>
>
> R’s
>
> Ravi Kumar B
>
>
>
> *From:* Gabriel Reid [mailto:gabriel.r...@gmail.com]
> *Sent:* Wednesday, September 28, 2016 5:51 PM
> *To:* user@phoenix.apache.org
> *Subject:* Re: Loading via MapReduce, Not Moving HFiles to HBase
>
>
>
> H
Hi Ravi,
It looks like those log file entries you posted are from a mapreduce task.
Could you post the output of the command that you're using to start the
actual job (i.e. console output of "hadoop jar ...").
- Gabriel
On Wed, Sep 28, 2016 at 1:49 PM, Ravi Kumar Bommada
er way?
>
> Best Regards,
> ANKIT BEOHAR
>
>
> On Wed, Aug 24, 2016 at 1:31 PM, Gabriel Reid <gabriel.r...@gmail.com>
> wrote:
>>
>> Hi Ankit,
>>
>> All data stored in HBase is stored in the form of byte arrays. The
>> conversion from richer
Hi Ankit,
All data stored in HBase is stored in the form of byte arrays. The
conversion from richer types (e.g. date) to byte arrays is one of the
(many) functionalities included in Phoenix.
When you add a date value in the form of a string to HBase directly
(bypassing Phoenix), you're simply
Hi Aaron,
I feel like I've seen this one before, but I'm not totally sure.
What I would look at first is a possible hbase-xxx version issue.
Something along these lines that I've seen in the past is that another
uber-jar JDBC driver that is on the classpath also contains hbase or
zookeeper
> suspect the insert does MapReduce as well or is there some other mechanism
> that would scale?
> (8) Compaction/Statistics Operation on Aggregate Table
>
> I really appreciate all the support. We are trying to run a Phoenix TPCH
> benchmark and are struggling a bit to unders
all(ScannerCallableWithReplicas.java:350)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:324)
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>
Hi Aaron,
I'll answer your questions directly first, but please see the bottom
part of this mail for important additional details.
You can specify the
"hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily" parameter
(referenced from your StackOverflow link) on the command line of you
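The rest of that command was cut off in the archive; a sketch of the
shape it would take (the value 128, the table name, and the input path
are only placeholders):
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=128 \
    --table EXAMPLE_TABLE --input /data/example.csv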
Hi Radha,
This looks to me as if there is an issue in your data somewhere past
the first 100 records. The bulk loader isn't supposed to fail due to
issues like this. Instead, it's intended to simply report the problem
input lines and continue on, but it appears that this isn't happening.
Could
> Thanks, I didn't find the fs.defaultFS property being overwritten. And I have
>>> changed to using Pig to load table data into Phoenix.
>>>
>>> 2016-05-11 14:23 GMT+08:00 Gabriel Reid <gabriel.r...@gmail.com>:
>>>
>>>> Another idea: could you check in
>
/home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:
>>
>> /home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:
>> /home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:
>>
>> /home/dcos/hadoop-2.
parallelism.
> But, how would it impact the aggregate queries?
>
> Vamsi Attluri
>
> On Wed, Mar 16, 2016 at 9:06 AM Gabriel Reid <gabriel.r...@gmail.com> wrote:
>>
>> Hi Vamsi,
>>
>> The first thing that I notice looking at the info that you've posted
Hi Vamsi,
The first thing that I notice looking at the info that you've posted
is that you have 13 nodes and 13 salt buckets (which I assume also
means that you have 13 regions).
A single region is the unit of parallelism that is used for reducers
in the CsvBulkLoadTool (or HFile-writing
Hi Vamsi,
I can't answer your question about the Phoenix-Spark plugin (although
I'm sure that someone else here can).
However, I can tell you that the CsvBulkLoadTool does not write to the
WAL or to the Memstore. It simply writes HFiles and then hands those
HFiles over to HBase, so the memstore
Hi Zack,
If bulk loading is currently slow or error prone, I don't think that
this approach would improve the situation.
From what I understand from that link, this is a way to copy the
contents of a Hive table into HFiles. Hive operates via mapreduce
jobs, so this is technically a map reduce
Hi Parth,
Setting the "fs.permissions.umask-mode" config setting to "000" should
do the job. You can do this in your hadoop config on the machine where
you're submitting the job, or just supply it as the leading
command-line parameter as follows:
hadoop jar phoenix-client.jar
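The remainder of the command was truncated above; roughly, the leading
-D parameter is supplied like this (the table name and input path are
placeholders):
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dfs.permissions.umask-mode=000 \
    --table EXAMPLE_TABLE --input /data/example.csv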
gupt...@gmail.com>
>>> wrote:
>>>>
>>>> I dont see 4.5.3 release over here:
>>>> http://download.nextag.com/apache/phoenix/
>>>> Is 4.5.3 not released yet?
>>>>
>>>> On Wed, Dec 30, 2015 at 11:14 AM, anil gupta <anil
On Fri, Dec 18, 2015 at 9:35 PM, Cox, Jonathan A wrote:
>
> The Hadoop version is 2.6.2.
>
I'm assuming the reduce phase is failing with the OOME, is that correct?
Could you run "jps -v" to see what the full set of JVM parameters are
for the JVM that is running the task that
On Fri, Dec 18, 2015 at 4:31 PM, Riesland, Zack
wrote:
> We are able to ingest MUCH larger sets of data (hundreds of GB) using the
> CSVBulkLoadTool.
>
> However, we have found it to be a huge memory hog.
>
> We dug into the source a bit and found that
>
Hi Jonathan,
Sounds like something is very wrong here.
Are you running the job on an actual cluster, or are you using the
local job tracker (i.e. running the import job on a single computer).
Normally an import job, regardless of the size of the input, should
run with map and reduce tasks that
Hi Jonathan,
It looks like this is a bug that was relatively recently introduced in
the bulk load tool (i.e. that the exit status is not correctly
reported if the bulk load fails). I've logged this as a jira ticket:
https://issues.apache.org/jira/browse/PHOENIX-2538.
This means that for now,
two different ways (just to be sure):
>
> export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx48g"
>
> and also in mapred-site.xml:
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx48g</value>
> </property>
>
>
> -Original Message-
> From: Gabriel Reid [mailto:gabr
Hi Afshin,
That looks like a bug to me, although I'm not too confident about
coming up with a good fix for it.
There isn't any handling in the bulk load tool for multiple updates to
the same row in a single input. Basically, the code assumes that a
single given row is only included once in any
This looks like an incompatibility between HBase versions (i.e.
between the version that Phoenix is built against, and the version
that you've got installed on your cluster).
The reason that the bulk loader and fat client are causing issues is
that they include the linked versions of the hbase
Indeed, this was a regression. It has since been fixed in PHOENIX-1277 [1],
and is available in Phoenix 4.4.1 and Phoenix 4.5.0.
- Gabriel
1. https://issues.apache.org/jira/browse/PHOENIX-1277
On Wed, Nov 25, 2015 at 4:07 AM, 彭昶勳 wrote:
> Hi,
>
> In Phoenix-4.3.0 or later
Re-adding the user list, which I accidentally left off.
On Wed, Nov 18, 2015 at 3:55 PM, Gabriel Reid <gabriel.r...@gmail.com> wrote:
> Yes, I believe that's correct, if you change the umask you make the
> HFiles readable to all during creation.
>
> I believe that the alternat
t worked now.. I hope this is the correct
> thing to do ?
>
> conf.set("fs.permissions.umask-mode", "000");
>
>
> Thanks Again
>
> Sanooj
>
> On Wed, Nov 18, 2015 at 12:29 AM, Gabriel Reid <gabriel.r...@gmail.com>
> wrote:
>>
>>
On Thu, Oct 29, 2015 at 6:33 PM, James Taylor wrote:
> I seem to remember you starting down that path, Gabriel - a kind of
> pluggable transformation for each row. It wasn't pluggable on the input
> format, but that's a nice idea too, Ravi. I'm not sure if this is what
gt;/_SUCCESS
> 15/10/23 06:02:19 WARN hbase.HBaseConfiguration: Config option
> "hbase.regionserver.lease.period" is deprecated. Instead, use
> "hbase.client.scanner.timeout.period"
> 15/10/23 06:02:19 INFO util.ChecksumType: Checksum using
> org.apache.hadoop.util.Pur
Hi Zack,
I can't give you any information about compatibility of a given
Phoenix version with a given version of HDP (because I don't know).
However, could you give a bit more info on what you're seeing? Are all
import jobs failing with this error for a given set of tables? Or is
this a random
Hi Zack,
It looks like there is probably an older version of HBase somewhere
(earlier) in the classpath.
I don't know anything about Aqua Data Studio, but could it be that it
somehow bundles support for HBase 0.94 somewhere (or perhaps there is
another JDBC driver on the class path that works
doop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
> at
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:106)
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.
Hi Gaurav,
Looking at your DDL statement, I'm guessing that your table is
currently made up of 33 regions, which means that the time to do a
full count query will take at least as long as it takes to count 27
million rows with a single thread (900 million rows divided by 33
regions).
The
Hi Zack,
I've never actually tried it, but I don't see any reason why it
shouldn't work. Starting the job with the parameter
-Dmapreduce.job.queuename= should do the
trick, assuming everything else is set up.
- Gabriel
On Thu, Sep 24, 2015 at 7:05 PM, Riesland, Zack
Hi Dhruv,
This is a bug in Phoenix, although it appears that your hadoop
configuration is also somewhat unusual.
As far as I can see, your hadoop configuration is set up to use the
local filesystem, and not hdfs. You can test this by running the
following command:
hadoop dfs -ls /
If that
13944e4ab1195c70ff530ee
>> first=\x80\x00\x00\x00\x00\x0009 last=\x80\x00\x00\x00\x00\x01\x092
>
>
> That's why we explicitly provided --output as hdfs:// and things at least
> worked.
>
> --
> Dhruv
>
> On Wednesday 23 September 2015 06:54 PM, Gabriel Reid wrote
anade <gaurav.kan...@gmail.com>
> wrote:
>
>> Thanks for the pointers Gabriel! Will give it a shot now!
>>
>> On 16 September 2015 at 12:15, Gabriel Reid <gabriel.r...@gmail.com>
>> wrote:
>>
>>> Yes, there is post-processing that g
ors "Added a key not lexically larger than previous
> key"
>
> Thanks a lot!
>
>
> On 15 September 2015 at 19:46, Gabriel Reid <gabriel.r...@gmail.com> wrote:
>>
>> The upsert statements in the MR jobs are used to convert data into the
>> appro
t;
> Am I missing something simple ?
>
> Thanks
> Gaurav
>
>
> On 12 September 2015 at 11:16, Gabriel Reid <gabriel.r...@gmail.com> wrote:
>>
>> Around 1400 mappers sounds about normal to me -- I assume your block
>> size on HDFS is 128 MB, which works out to
95934997
> File Output Format Counters
> Name
> Map
> Reduce
> Total
> Bytes Written
> <http://headnode0.ctlynvnzlysu3nnyyhqmcwjbee.gx.internal.cloudapp.net:19888/jobhistory/singlejobcounter/job_1442389862209_0002/org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCoun
The upsert statements in the MR jobs are used to convert data into the
appropriate encoding for writing to an HFile -- the data doesn't actually
get pushed to Phoenix from within the MR job. Instead, the created
KeyValues are extracted from the "output" of the upsert statement, and the
statement
There isn't a command available to rename or change the data type of a
column in Phoenix -- to do something like this, you need to drop a
column and then create a new column.
If you have existing data that you want to migrate, I would suggest
doing the following:
1. Create the new column (with
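The remaining steps were cut off here; a hedged sketch of the usual
migrate-and-drop sequence, assuming a hypothetical table T with primary
key PK and an old VARCHAR column OLD_COL being replaced by NEW_COL:
ALTER TABLE T ADD NEW_COL VARCHAR;                       -- 1. create the new column
UPSERT INTO T (PK, NEW_COL) SELECT PK, OLD_COL FROM T;   -- 2. copy the existing data
ALTER TABLE T DROP COLUMN OLD_COL;                       -- 3. drop the old column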
Around 1400 mappers sounds about normal to me -- I assume your block
size on HDFS is 128 MB, which works out to 1500 mappers for 200 GB of
input.
To add to what Krishna asked, can you be a bit more specific on what
you're seeing (in log files or elsewhere) which leads you to believe
the data
On Tue, Sep 1, 2015 at 3:04 AM, Behdad Forghani wrote:
> In my experience the fastest way to load data is directly write to HFile. I
> have measured a performance gain of 10x. Also, if you have binary data or
> need to escape characters HBase bulk loader does not escape
On Tue, Sep 1, 2015 at 11:29 AM, Riesland, Zack
wrote:
> You say I can find information about spills in the job counters. Are you
> talking about “failed” map tasks, or is there something else that will help
> me identify spill scenarios?
"Spilled records" is a counter
-12 10:20:21.125 directly at Phoenix?
Thanks
On 18 Aug 2015 at 21:48, Gabriel Reid gabriel.r...@gmail.com wrote:
Ok, thanks for those snippets -- I think that's enough to explain what is
happening. The biggest cause for confusion here is probably the way that
sqlline retrieves values from
Filtering a query on the leading columns of the primary key (i.e. [A],
[A,B], or [A,B,C]) will give optimal performance. This is because the
records are in sorted order based on the combination of [A,B,C], so
filtering on a leading subset of the primary key is basically the same as
filtering on
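For instance, with a hypothetical table whose primary key is (A, B, C):
SELECT * FROM EXAMPLE_TABLE WHERE A = 'x';               -- leading prefix, fast
SELECT * FROM EXAMPLE_TABLE WHERE A = 'x' AND B = 'y';   -- still a prefix scan
SELECT * FROM EXAMPLE_TABLE WHERE C = 'z';               -- no leading prefix, full scan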
Hi Tom,
I've tried your SQL statements with 4.3.1, and the initial one does indeed
work in 4.3.1 and later. The view definition that includes a CAST statement
still fails (due to a bug in CastParseNode for which I'll post a patch
shortly).
By the way, the way I tested this (and an easy way to
Are you supplying the -z parameter with the ZooKeeper quorum? This will be
necessary if ZooKeeper isn't running on the localhost and/or isn't
configured in the local configuration (see
http://phoenix.apache.org/bulk_dataload.html).
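For example (the quorum string, table name, and input path below are
placeholders):
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -z zk1,zk2,zk3:2181 --table EXAMPLE_TABLE --input /data/example.csv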
- Gabriel
On Mon, Jul 6, 2015 at 12:08 PM Riesland, Zack
Hi Zack,
No, you don't need to worry about the name of the primary key getting in
the way of the rows being added.
Like Anil pointed out, the best thing to look at first is the job counters.
The relevant ones for debugging this situation are the total map inputs and
total map outputs, total
clear up a lot of confusion…
*From:* Gabriel Reid [mailto:gabriel.r...@gmail.com]
*Sent:* Thursday, June 25, 2015 2:44 PM
*To:* user@phoenix.apache.org
*Subject:* Re: Bug in CsvBulkLoad tool?
Hi Zack,
The job counters are available in the YARN resource manager and/or YARN
…
*From:* Gabriel Reid [mailto:gabriel.r...@gmail.com
gabriel.r...@gmail.com]
*Sent:* Tuesday, June 23, 2015 2:57 AM
*To:* user@phoenix.apache.org
*Subject:* Re: How To Count Rows In Large Phoenix Table?
Hi Zack,
Would it be possible to provide a few more details on what kinds
The default column family name is 0. This is the string containing the
character representation of zero (or in other words, a single byte with
value 48).
And yes, it's possible to read Phoenix tables using the HBase API (although
it's of course a lot easier if you go via Phoenix).
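As a quick illustration from the HBase shell (the table name is a
placeholder):
scan 'EXAMPLE_TABLE', {COLUMNS => ['0'], LIMIT => 5}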
- Gabriel
On
Hi Zack,
Would it be possible to provide a few more details on what kinds of
failures that you're getting, both with the CsvBulkLoadTool, and with the
SELECT COUNT(*) query?
About question #1, there aren't any known bugs (that I'm aware of) that
would cause some records to go missing in the
/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java#L99-L100
On Sat, May 16, 2015 at 9:14 AM, Gabriel Reid gabriel.r...@gmail.com
wrote:
Hi Nick,
The date format is (if I'm not mistaken) ISO-8601, so I think you'll have
to format your date values as 1970-01-01.
- Gabriel
On Fri
Hi Nick,
The date format is (if I'm not mistaken) ISO-8601, so I think you'll have
to format your date values as 1970-01-01.
- Gabriel
On Fri, May 15, 2015 at 02:02 Nick Dimiduk ndimi...@gmail.com wrote:
Heya,
Giving the RC a spin, and also investigating the impact of HBASE-13604, I'm
Hi Kiru,
How many regions are there on this table?
Could you also share some information on the schema of the table (e.g. how
many columns are defined)?
Does a limit 10 query also hang in this table?
Could you also elaborate a bit on the issues you were running into when
loading data into the
Hi Siva,
Yes, that's pretty much correct -- TO_DATE is returning a Date value, which
has millisecond granularity -- the fact that you're only seeing a date
(with no time component) is due to the way in which the Date is formatted,
and not its internal value.
- Gabriel
On Sun, May 3, 2015 at
Hi Siva,
The TO_DATE function returns a java.sql.Date value, and the string
representation of the java.sql.Date value is what you're seeing in your
sqlline session.
The internal long representation of the Date value coming out of TO_DATE
will represent the date to millisecond granularity
Hi Matt,
How are you viewing the timestamps (or in other words, how are you
verifying that they're not in GMT)?
The reason I ask is because internally in Phoenix, timestamps are used
without a timezone (they're just based on a long, as you've saved in your
table). However, the
That certainly looks like a bug. Would it be possible to make a small
reproducible test case and if possible, log this in the Phoenix JIRA (
https://issues.apache.org/jira/browse/PHOENIX) ?
- Gabriel
On Mon, Apr 6, 2015 at 6:10 PM Marek Wiewiorka marek.wiewio...@gmail.com
wrote:
Hi All,
I
Hi,
I believe Squirrel does some kind of implicit LIMIT 100 on statements, so
my guess would be that it's adding a LIMIT 100 to your UPSERT SELECT
statement.
Could you try the same thing using sqlline to verify if it's a problem
there as well?
- Gabriel
On Mon, Mar 30, 2015 at 12:28 PM Dark
You can't set auto-commit via the connection properties or JDBC url in
version 3.0.0. However, this is possible (via the AutoCommit property) as
of version 3.3 and 4.3.
Other client-side properties can be set via the Properties object passed in
to DriverManager.connect.
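A minimal sketch of the Properties approach (assuming version 3.3/4.3
or later and a ZooKeeper quorum on localhost; not code from this
thread):
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("AutoCommit", "true");  // enable auto-commit on the connection
Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props);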
- Gabriel
On Wed, Mar
The correct syntax for a Phoenix JDBC url with a tenant id is as follows:
localhost:2181;TenantId=foo
Note that the TenantId parameter is capitalized (it's case-sensitive).
However (on Linux or Mac at least), it's not currently possible to connect
with a tenant-specific connection like this, as
client.HConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x24bd5223c471259
15/03/02 15:21:35 INFO zookeeper.ZooKeeper: Session: 0x24bd5223c471259
closed
15/03/02 15:21:35 INFO zookeeper.ClientCnxn: EventThread shut down
*From:* Gabriel Reid [mailto:gabriel.r...@gmail.com
way like Hbase
loader. What do you say, any thoughts on this?
Thanks,
Siva.
On Wed, Feb 11, 2015 at 11:34 PM, Gabriel Reid gabriel.r...@gmail.com
wrote:
Hi Siva,
If I understand correctly, you want to explicitly supply null values
in a CSV file for some fields. In general, this should
I'm not aware of a Cascading Tap for reading/writing to and from
Phoenix. Phoenix-specific InputFormat and OutputFormat implementations
were recently added to Phoenix, so if there's an easy way to wrap an
existing InputFormat and OutputFormat as a Tap in Cascading, then this
would probably be the
Hi Siva,
If I understand correctly, you want to explicitly supply null values
in a CSV file for some fields. In general, this should work by just
leaving the field empty in your CSV file. For example, if you have
three fields (id, first_name, last_name) in your CSV file, then a
record like
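The concrete example was truncated here; presumably it was along these
lines:
1,John,
would load a null value for last_name, and
2,,Smith
would load a null value for first_name.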
Hi Thanaphol,
Could you elaborate on how you're debugging this issue? The reason I ask is
that the JDBC Timestamp class does some of its own formatting when you
query it as a string (it formats the string to a timestamp in the local
timezone).
The general rules are as follows:
* the bulk loader
Hi Constantin,
The issues you're having sound like they're (probably) much more
related to MapReduce than to Phoenix. In order to first determine what
the real issue is, could you give a general overview of how your MR
job is implemented (or even better, give me a pointer to it on GitHub
or
Hi David,
The PhoenixConnection class is not thread-safe, and shouldn't be
shared over multiple threads. I think that this is probably the case
with quite a few other JDBC drivers as well, so it's generally safer
to use a JDBC connection pool if you want to use connections in
multiple threads.
Hi Marco,
Because the name of your 'e' column is not quoted in your DDL
statement, the column name 'e' internally gets upper-cased to 'E'. If
you store values under column-family 'events' and column qualifier
'E', they will show up in Phoenix queries.
Alternatively, you can change your DDL
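A sketch of the quoted-identifier alternative (the table and the other
column are hypothetical):
CREATE TABLE EXAMPLE_TABLE (
    id VARCHAR PRIMARY KEY,
    "events"."e" VARCHAR  -- quoting preserves the lower-case family and qualifier
)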
)
= 1234567890
ll.to_byte_array
NoMethodError: undefined method 'to_byte_array' for 1234567890:Fixnum
-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Thursday, January 08, 2015 7:30 PM
To: user@phoenix.apache.org
Subject: Re: Numbers low-level format in Phoenix
Hi Kunal,
I think you'll need to post some additional information to get an
answer to your question. You said that MySQL returns 35 rows and
Phoenix returns 32 rows, but it's not clear from your description what
the rows are that are missing from the Phoenix result, or what it is
that makes the
Hi Noam,
It doesn't sound all that surprising that you're CPU bound on a batch
import job like this if you consider everything that is going on
within the mappers.
Let's say you're importing data for a table with 20 columns. For each
line of input data, the following is then occurring within the
Hi Noam,
I think that the things that most typically can affect MR loading
performance are:
* number of regions (as this affects the number of reducers used to
create the HFiles)
* amount of memory used for sort buffers
* use of compression on map output
With your 32-region salted table, it
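As an illustrative (not prescriptive) example, the last two items can
be tuned in mapred-site.xml (or passed as -D options on the job); using
the Hadoop 2 property names:
<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>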
for the answer we will look in to it and update
The impala is impala parquet table
-Original Message-
From: Gabriel Reid [mailto:gabriel.r...@gmail.com]
Sent: Tuesday, December 23, 2014 2:27 PM
To: user@phoenix.apache.org
Subject: Re: CSV bulk loading using map reduce
Hi Noam,
I think
Hi Vamsi,
Running upsert statements like that will indeed not work in Phoenix
(that grammar isn't supported).
What you're trying to accomplish is technically the same as executing
multiple upsert statements and then committing at the end of the
batch. This can be accomplished by running multiple
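A hedged sketch of that pattern in sqlline (the table and values are
invented):
!autocommit off
UPSERT INTO EXAMPLE_TABLE VALUES (1, 'first');
UPSERT INTO EXAMPLE_TABLE VALUES (2, 'second');
!commit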
Hi,
This is an issue when using newer versions of HBase with MapReduce,
explained here:
https://hbase.apache.org/book.html#hbase.mapreduce.classpath
The specific command invocation that you need to get around this issue
is documented in the Loading via MapReduce section of this page:
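The URL was stripped by the archive; the documented invocation is
roughly of the following shape, with the paths, table name, and input
location as placeholders:
HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase/conf \
    hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE --input /data/example.csv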
character as separator.
On Thu, Dec 11, 2014 at 2:04 AM, Gabriel Reid gabriel.r...@gmail.com wrote:
I just discovered how to get this working properly (I had wrongly
assumed that simply supplying '\t' was sufficient).
In order to supply a tab character as the delimiter, you need to
supply
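The rest of that explanation was cut off here; one way to pass a
literal tab from a bash shell (shown with psql.py purely as an
illustration; the quorum, table, and file are placeholders) is ANSI-C
quoting, since $'\t' expands to an actual tab character:
./psql.py localhost -t EXAMPLE_TABLE /data/example.tsv -d $'\t'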
Hi Rama,
Could you give a bit more information, including:
* what is the exact full command that you're entering
* which version of Phoenix are you using
* what is your environment (i.e. which OS are you running on)
* what is the exact error (and/or stacktrace) that you're getting
Thanks,
Hi Rama,
Sorry, I lost track of this.
The steps to set up your environment to run mapreduce will depend on
which version of Hadoop you're using, as well as which distribution
(i.e. the base Apache release, CDH, HDP, or something else).
If you're running the base Apache release, then the docs
Thanks for pointing out PHOENIX-976 James (I had lost track of that
one), but I think that this is a different issue.
@Rama, I see you're running on Windows. Can you confirm that you're
able to start (non-Phoenix) MapReduce jobs from your Windows machine?
In any case, the configuration parameter
you'd approach generating
hfiles. Would you extend the csv bulk loader? How would you represent
dynamic columns in a csv? A general solution is also further complicated by
the fact that a dynamic column may have heterogeneous types.
-Bob
On Thursday, October 16, 2014 12:24 AM, Gabriel Reid
__
Ralph Perko
Pacific Northwest National Laboratory
On 10/9/14, 11:17 AM, Gabriel Reid gabriel.r...@gmail.com wrote:
Hi Ralph,
I think this depends a bit on how quickly you want to get the data
into Phoenix after it arrives, what kind
Hi,
You've got the usage of the command correct there, but the semi-colon
character has a special meaning in most shells. Wrapping it with
single quotes should resolve the issue, as follows:
./psql.py z1:/hbase -t NATION ../sample/NATION.csv -d ';'
- Gabriel
On Thu, Oct 9, 2014 at 5:26
Hi Noam,
Could you post the error message and/or stack trace you're getting
when Oozie says that a jar is missing or you don't have permission to
read it?
- Gabriel
On Sun, Oct 5, 2014 at 8:40 AM, Bulvik, Noam noam.bul...@teoco.com wrote:
Hi,
We are trying to do periodic bulk loading using
as we are using hadoop 2.3.
Is there anything i am missing here?
Thanks,
Abe
On Fri, Jul 25, 2014 at 8:55 AM, Gabriel Reid
gabriel.r...@gmail.com
wrote:
Hi Sid,
The location of the jar file looks correct. However, there seems
to be
an issue with the build of hadoop2
Hi,
This is due to the permgen space (a pre-allocated portion of memory
set up by the JVM) being too small. It looks like you're on Windows,
and I'm guessing that you're using a client-mode VM, which I think
means your permgen is 32 MB.
I'm not really familiar with SQuirreL on windows, but from
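If that's the case, raising the limit with a JVM flag along these
lines (the value is just an example, and where to set it depends on how
SQuirreL is launched) should help:
-XX:MaxPermSize=128m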
Hi Vadim,
Sorry for the long delay on this.
Just to be sure, can you confirm that you're using the hadoop-2 build
of Phoenix 4.0 on the client when starting up the CsvBulkLoadTool?
Even if you are, this may actually require a rebuild of Phoenix using
CDH 5.1.0 dependencies.
Could you post the
)
at
org.apache.hadoop.hbase.client.HTable.checkAndPut(HTable.java:946)
at HAdminTest.testCheckNPut(HAdminTest.java:150)
at HAdminTest.main(HAdminTest.java:257)
Thanks,
~Ashish
On Tue, Aug 5, 2014 at 1:21 PM, Gabriel Reid gabriel.r...@gmail.com wrote:
Hi Ashish,
Could you post the full stack
Hi Abe,
I believe the second part of the "Expected single, aggregated
KeyValue" error message is "Ensure aggregating coprocessors are loaded
correctly on server". The fact that a basic scan isn't finding the
org.apache.phoenix.filter.ColumnProjectionFilter class also points to
the same thing: that
Hi Ashish,
Could you post the full stack trace you're getting when the
checkAndPut fails? No immediate reason I can think of as to why this
would happen.
- Gabriel
On Tue, Aug 5, 2014 at 7:57 PM, ashish tapdiya ashishtapd...@gmail.com wrote:
Folks,
any intuition why this is happening.
the same results for relatively small query
size, but diverge when the query size is really big (30*70 millions rows)
On Tue, Jun 10, 2014 at 2:07 PM, Gabriel Reid gabriel.r...@gmail.com
wrote:
Hi Sean,
That doesn't sound right -- any idea which of the queries (if either)
is returning