ema name (optional)
> -z,--zookeeper Zookeeper quorum to connect to (optional)
> -it,--index-table Index table name to load (optional)
>
>
>
>
> From: Gabriel Reid
> Subject: Re: Error with lines ended with backslash when Bulk Data Loading
Hi
Backslash is the default escape character that is used for parsing CSV
data when running a bulk import, so it has a special meaning.
You can supply a different (custom) escape character with the -e or
--escape flag on the command line so that parsing your CSV files that
include backslashes lik
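For example, a minimal sketch of a bulk load invocation that overrides the escape character (the table name, input path, and the choice of '!' as the escape character are illustrative placeholders, not values from this thread):

# Use '!' instead of the default backslash as the CSV escape character.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE \
    --input /data/example.csv \
    --escape '!'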
Hi Zack,
My initial gut feeling is that this doesn't have anything to do with the
commas in the input data, but it looks like instead the pipe separator
isn't being taken into account. Has this been working for you with other
data files?
I've got more questions than answers to start with:
* Whic
Hi Zack,
Am I correct in understanding that the files are under a structure like
x/.deflate/csv_file.csv ?
In that case, I believe everything under the .deflate directory will
simply be ignored, as directories whose name start with a period are
considered "hidden" files.
However, assuming the dat
> R’s
>
> Ravi Kumar B
>
>
>
> *From:* Gabriel Reid [mailto:gabriel.r...@gmail.com]
> *Sent:* Wednesday, September 28, 2016 5:51 PM
> *To:* user@phoenix.apache.org
> *Subject:* Re: Loading via MapReduce, Not Moving HFiles to HBase
>
>
>
> Hi Ravi,
>
>
>
Hi Ravi,
It looks like those log file entries you posted are from a mapreduce task.
Could you post the output of the command that you're using to start the
actual job (i.e. console output of "hadoop jar ...").
- Gabriel
On Wed, Sep 28, 2016 at 1:49 PM, Ravi Kumar Bommada
wrote:
> Hi All,
>
> I
g the PDataType is some kind of equivalent
> and that this is part of the Phoenix JDBC (fat) driver?
>
> Thanks,
> Ryan
>
>
>
>
>> On 8/24/16, 3:01 AM, "Gabriel Reid" wrote:
>>
>> Hi Ankit,
>>
>> All data stored in HBase is stored in th
ards,
> ANKIT BEOHAR
>
>
> On Wed, Aug 24, 2016 at 1:31 PM, Gabriel Reid
> wrote:
>>
>> Hi Ankit,
>>
>> All data stored in HBase is stored in the form of byte arrays. The
>> conversion from richer types (e.g. date) to byte arrays is one of the
>> (m
Hi Ankit,
All data stored in HBase is stored in the form of byte arrays. The
conversion from richer types (e.g. date) to byte arrays is one of the
(many) functionalities included in Phoenix.
When you add a date value in the form of a string to HBase directly
(bypassing Phoenix), you're simply sav
Hi Tom,
What's the primary key definition of your table? Does it have salted row keys?
In the first example (the one that works) I see a leading byte on the
row key, which makes me think that you're using salting. In the second
example (the one that isn't working) I see the leading "\x00" being
a
Hi Aaron,
I feel like I've seen this one before, but I'm not totally sure.
What I would look at first is a possible hbase-xxx version issue.
Something along these lines that I've seen in the past is that another
uber-jar JDBC driver that is on the classpath also contains hbase or
zookeeper classe
suspect the insert does MapReduce as well or is there some other mechanism
> that would scale?
> (8) Compaction/Statistics Operation on Aggregate Table
>
> I really appreciate all the support. We are trying to run a Phoenix TPCH
> benchmark and are struggling a bit to underst
0)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:350)
> at
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:324)
> at
> org.a
Hi Aaron,
I'll answer your questions directly first, but please see the bottom
part of this mail for important additional details.
You can specify the
"hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily" parameter
(referenced from your StackOverflow link) on the command line of your
CsvBulk
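As a sketch, such a parameter can be passed as a -D option ahead of the tool's own flags (the value of 1024 and the table/input names are assumptions for illustration):

# Raise the per-region, per-family HFile limit for the bulk load's completebulkload step.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=1024 \
    --table EXAMPLE_TABLE \
    --input /data/example.csv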
Hi Radha,
This looks to me as if there is an issue in your data somewhere past
the first 100 records. The bulk loader isn't supposed to fail due to
issues like this. Instead, it's intended to simply report the problem
input lines and continue on, but it appears that this isn't happening.
Could yo
Hi Vikash,
If I'm not mistaken, the bulk load tool was changed in 4.7 to populate
the main table and index tables in a single job (instead of one job
per table).
However, based on what you're seeing, it sounds like there's a problem
with this change.
Could you verify that only one index table wa
hbase-site.xml.
>> You may take a look at that property.
>>
>> Thanks,
>> Sandeep Nemuri
>>
>> On Wed, May 11, 2016 at 1:49 PM, kevin wrote:
>>
>>> Thanks, I didn't find the fs.defaultFS property being overwritten. And I hav
-all-1.8.jar:
>>
>> /home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:
>> /home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/lib/javax.inject-1.jar:
>>
>> /home/dcos/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar:
> <property>
>   <name>dfs.permissions</name>
>   <value>false</value>
> </property>
>
> 2016-05-10 14:32 GMT+08:00 kevin :
>>
>> Thanks, what I use is from Apache, and Hadoop/HBase are in cluster mode
>> with one master and three slaves
>>
>> 2016-05-10 14:17 GMT+08:00 Gabriel Reid :
>>>
>
Hi,
It looks like your setup is using a combination of the local
filesystem and HDFS at the same time, so this looks to be a general
configuration issue.
Are you running on a real distributed cluster, or a single-node setup?
Is this a vendor-based distribution (e.g. HDP or CDH), or an Apache
release
m.
> But, how would it impact the aggregate queries?
>
> Vamsi Attluri
>
> On Wed, Mar 16, 2016 at 9:06 AM Gabriel Reid wrote:
>>
>> Hi Vamsi,
>>
>> The first thing that I notice looking at the info that you've posted
>> is that you have 13 nodes a
Hi Vamsi,
The first thing that I notice looking at the info that you've posted
is that you have 13 nodes and 13 salt buckets (which I assume also
means that you have 13 regions).
A single region is the unit of parallelism that is used for reducers
in the CsvBulkLoadTool (or HFile-writing MapReduc
Hi Vamsi,
I can't answer your question about the Phoenix-Spark plugin (although
I'm sure that someone else here can).
However, I can tell you that the CsvBulkLoadTool does not write to the
WAL or to the Memstore. It simply writes HFiles and then hands those
HFiles over to HBase, so the memstore a
Hi Zack,
If bulk loading is currently slow or error prone, I don't think that
this approach would improve the situation.
From what I understand from that link, this is a way to copy the
contents of a Hive table into HFiles. Hive operates via mapreduce
jobs, so this is technically a map reduce jo
> James
>
> On Fri, Jan 29, 2016 at 11:03 PM, Parth Sawant
> wrote:
>>
>> Hi Gabriel,
>> This worked perfectly.
>>
>> Thanks a lot.
>> Parth S
>>
>> On Fri, Jan 29, 2016 at 10:29 PM, Gabriel Reid
>> wrote:
>>>
Hi Parth,
Setting the "fs.permissions.umask-mode" config setting to "000" should
do the job. You can do this in your hadoop config on the machine where
you're submitting the job, or just supply it as the leading
command-line parameter as follows:
hadoop jar phoenix-client.jar
org.apache.phoen
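A sketch of how that invocation would look in full, assuming the standard CsvBulkLoadTool class and placeholder table/input values:

# Set the umask as a leading -D parameter so the generated HFiles are world-readable.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dfs.permissions.umask-mode=000 \
    --table EXAMPLE_TABLE \
    --input /data/example.csv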
.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1444)
>>>
>>> Is my bulkloader command incorrect?
>>>
>>>
>>> Thanks,
>>> Anil Gupta
>>>
>>> On Wed, Dec 30, 2015 at 11:23 AM, anil gupta
>>> wrote:
>>>>
Hi Anil,
This issue was resolved a while back, via this ticket:
https://issues.apache.org/jira/browse/PHOENIX-2238
Unfortunately, that fix is only available starting from Phoenix 4.6
and 4.5.3 (i.e. it wasn't back-ported to 4.4.x).
- Gabriel
On Wed, Dec 30, 2015 at 1:21 AM, anil gupta wrote:
>
On Fri, Dec 18, 2015 at 9:35 PM, Cox, Jonathan A wrote:
>
> The Hadoop version is 2.6.2.
>
I'm assuming the reduce phase is failing with the OOME, is that correct?
Could you run "jps -v" to see what the full set of JVM parameters are
for the JVM that is running the task that is failing? I can't
different ways (just to be sure):
>
> export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx48g"
>
> and also in mapred-site.xml:
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx48g</value>
> </property>
>
> -Original Message-
> From: Gabriel Reid [mailto:g
On Fri, Dec 18, 2015 at 4:31 PM, Riesland, Zack
wrote:
> We are able to ingest MUCH larger sets of data (hundreds of GB) using the
> CSVBulkLoadTool.
>
> However, we have found it to be a huge memory hog.
>
> We dug into the source a bit and found that
> HFileOutputFormat.configureIncrementalLoa
Hi Jonathan,
Sounds like something is very wrong here.
Are you running the job on an actual cluster, or are you using the
local job tracker (i.e. running the import job on a single computer).
Normally an import job, regardless of the size of the input, should
run with map and reduce tasks that h
Hi Jonathan,
It looks like this is a bug that was relatively recently introduced in
the bulk load tool (i.e. that the exit status is not correctly
reported if the bulk load fails). I've logged this as a jira ticket:
https://issues.apache.org/jira/browse/PHOENIX-2538.
This means that for now, ther
This looks like an incompatibility between HBase versions (i.e.
between the version that Phoenix is built against, and the version
that you've got installed on your cluster).
The reason that the bulk loader and fat client are causing issues is
that they include the linked versions of the hbase jar
Hi Afshin,
That looks like a bug to me, although I'm not too confident about
coming up with a good fix for it.
There isn't any handling in the bulk load tool for multiple updates to
the same row in a single input. Basically, the code assumes that a
single given row is only included once in any gi
ch
> also take care of date column?
>
>
> On Wed, Nov 25, 2015 at 3:33 AM, Gabriel Reid
> wrote:
>
>> Indeed, this was a regression. It has since been fixed in PHOENIX-1277
>> [1], and is available in Phoenix 4.4.1 and Phoenix 4.5.0.
>>
>> - Gabriel
>>
Indeed, this was a regression. It has since been fixed in PHOENIX-1277 [1],
and is available in Phoenix 4.4.1 and Phoenix 4.5.0.
- Gabriel
1. https://issues.apache.org/jira/browse/PHOENIX-1277
On Wed, Nov 25, 2015 at 4:07 AM, 彭昶勳 wrote:
> Hi,
>
> In Phoenix-4.3.0 or later version, They change
Re-adding the user list, which I accidentally left off.
On Wed, Nov 18, 2015 at 3:55 PM, Gabriel Reid wrote:
> Yes, I believe that's correct, if you change the umask you make the
> HFiles readable to all during creation.
>
> I believe that the alternate solutions listed o
is the correct
> thing to do ?
>
> conf.set("fs.permissions.umask-mode", "000");
>
>
> Thanks Again
>
> Sanooj
>
> On Wed, Nov 18, 2015 at 12:29 AM, Gabriel Reid
> wrote:
>>
>> Hi Sanooj,
>>
>> I believe that this is rela
Hi Sanooj,
I believe that this is related to the issue described in PHOENIX-976
[1]. In that case, it's not strictly related to Kerberos, but instead
to file permissions (could it be that your dev environment also
doesn't have file permissions turned on?)
If you look at the comments on that jira
On Thu, Oct 29, 2015 at 6:33 PM, James Taylor wrote:
> I seem to remember you starting down that path, Gabriel - a kind of
> pluggable transformation for each row. It wasn't pluggable on the input
> format, but that's a nice idea too, Ravi. I'm not sure if this is what Noam
> needs or if it's some
Hi Noam,
That specific piece of code in CsvBulkLoadTool that you referred to
allows packaging the CsvBulkLoadTool within a different job jar file,
but won't allow setting a different mapper class. The actual setting
of the mapper class is done further down in the submitJob method,
specifically the
:02:19 WARN hbase.HBaseConfiguration: Config option
> "hbase.regionserver.lease.period" is deprecated. Instead, use
> "hbase.client.scanner.timeout.period"
> 15/10/23 06:02:19 INFO util.ChecksumType: Checksum using
> org.apache.hadoop.util.PureJavaCrc32
> 15/1
oot cause of my job actually failing.
>
> I certainly never noticed this before, though.
>
> The main things that we have changed since these scripts worked cleanly were
> upgrading our stack and adding new region servers.
>
> Does that help at all?
>
> -Original Message---
Hi Zack,
I can't give you any information about compatibility of a given
Phoenix version with a given version of HDP (because I don't know).
However, could you give a bit more info on what you're seeing? Are all
import jobs failing with this error for a given set of tables? Or is
this a random fa
Please send a message to user-unsubscr...@phoenix.apache.org to
unsubscribe from this list.
- Gabriel
On Wed, Sep 30, 2015 at 3:05 PM, Ashutosh Sharma
wrote:
>
>
> --
> With best Regards:
> Ashutosh Sharma
sterIdZNode(ZKClusterId.java:65)
> at
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:106)
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:858)
>
Hi Zack,
It looks like there is probably an older version of HBase somewhere
(earlier) in the classpath.
I don't know anything about Aqua Data Studio, but could it be that it
somehow bundles support for HBase 0.94 somewhere (or perhaps there is
another JDBC driver on the class path that works wi
Hi Gaurav,
Looking at your DDL statement, I'm guessing that your table is
currently made up of 33 regions, which means that the time to do a
full count query will take at least as long as it takes to count 27
million rows with a single thread (900 million rows divided by 33
regions).
The most-
Hi Zack,
I've never actually tried it, but I don't see any reason why it
shouldn't work. Starting the job with the parameter
-Dmapreduce.job.queuename= should do the
trick, assuming everything else is set up.
- Gabriel
On Thu, Sep 24, 2015 at 7:05 PM, Riesland, Zack
wrote:
> Hello,
>
>
>
> Can
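A minimal sketch of such an invocation (the queue name "ingest" and the table/input values are placeholders):

# Submit the bulk load job to a specific YARN queue.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dmapreduce.job.queuename=ingest \
    --table EXAMPLE_TABLE \
    --input /data/example.csv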
>> first=\x80\x00\x00\x00\x00\x0009 last=\x80\x00\x00\x00\x00\x01\x092
>
>
> That's why we explicitly provided --output as hdfs:// and things at least
> worked.
>
> --
> Dhruv
>
> On Wednesday 23 September 2015 06:54 PM, Gabriel Reid wrote:
>>
Hi Dhruv,
This is a bug in Phoenix, although it appears that your hadoop
configuration is also somewhat unusual.
As far as I can see, your hadoop configuration is set up to use the
local filesystem, and not hdfs. You can test this by running the
following command:
hadoop dfs -ls /
If that c
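One way to make the same check explicit, assuming a Hadoop 2.x client where the getconf command is available:

# Shows which default filesystem the client config points at:
# an hdfs:// URI means HDFS, a file:/// URI means the local filesystem.
hdfs getconf -confKey fs.defaultFS

# List the root of whatever that default filesystem is.
hadoop fs -ls /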
; total of more than 3.2 GB temp space is required.
>
> I will of course look at using compression of map output - but just wanted
> to check if this is expected behavior on workloads of this size.
>
> Thanks
> Gaurav
>
>
>
> On 16 September 2015 at 12:21, Gaurav Kanade
>
> [pasted job history counter table: Shuffle Errors — WRONG_LENGTH 0, WRONG_MAP 0]
le ?
>
> Thanks
> Gaurav
>
>
> On 12 September 2015 at 11:16, Gabriel Reid wrote:
>>
>> Around 1400 mappers sounds about normal to me -- I assume your block
>> size on HDFS is 128 MB, which works out to 1500 mappers for 200 GB of
>> input.
>>
>>
not lexically larger than previous
> key"
>
> Thanks a lot!
>
>
> On 15 September 2015 at 19:46, Gabriel Reid wrote:
>>
>> The upsert statements in the MR jobs are used to convert data into the
>> appropriate encoding for writing to an HFile -- the data doe
The upsert statements in the MR jobs are used to convert data into the
appropriate encoding for writing to an HFile -- the data doesn't actually
get pushed to Phoenix from within the MR job. Instead, the created
KeyValues are extracted from the "output" of the upsert statement, and the
statement is
There isn't a command available to rename or change the data type of a
column in Phoenix -- to do something like this, you need to drop a
column and then create a new column.
If you have existing data that you want to migrate, I would suggest
doing the following:
1. Create the new column (with the
Around 1400 mappers sounds about normal to me -- I assume your block
size on HDFS is 128 MB, which works out to 1500 mappers for 200 GB of
input.
To add to what Krishna asked, can you be a bit more specific on what
you're seeing (in log files or elsewhere) which leads you to believe
the data nodes
Hi,
Using prepared statements with Phoenix is basically the same as using
insert or update statements with any other JDBC driver (see
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html for
more details on this in general).
The one difference is that the actual SQL syntax that you u
On Tue, Sep 1, 2015 at 11:29 AM, Riesland, Zack
wrote:
> You say I can find information about spills in the job counters. Are you
> talking about “failed” map tasks, or is there something else that will help
> me identify spill scenarios?
"Spilled records" is a counter that is available at the jo
On Tue, Sep 1, 2015 at 3:04 AM, Behdad Forghani wrote:
> In my experience the fastest way to load data is directly write to HFile. I
> have measured a performance gain of 10x. Also, if you have binary data or
> need to escape characters HBase bulk loader does not escape characters. For
> my use
If the bulk of the time is being spent in the map phase, then there
probably isn't all that much that can be done in terms of tuning that will
make a huge difference. However, there may be a few things to look at.
You mentioned that HDFS decided to translate the hive export to 257 files
-- do you
responding to
> 2015-08-12 10:20:21.125 directly at Phoenix?
>
> Thanks
> On 18 Aug 2015 at 21:48, "Gabriel Reid" wrote:
>
> Ok, thanks for those snippets -- I think that's enough to explain what is
>> happening. The biggest cause for confusion here is pr
mple, if I insert the timestamp corresponding to 1-1-2015
> 10:00:00,
> > the inserted timestamp column would be 1-1-2015 07:00:00 and so on.
> > FYI, I use the DateTimeFormatter class for converting the date (which
> comes
> > with GMT+3 suffix) to a TimeStamp object as above
responding to 1-1-2015 10:00:00,
> the inserted timestamp column would be 1-1-2015 07:00:00 and so on.
> FYI, I use the DateTimeFormatter class for converting the date (which comes
> with GMT+3 suffix) to a TimeStamp object as above for inserting the date as
> a TimeStamp.
>
> - Da
Hi David,
How are you upserting timestamps? The phoenix.query.dateFormatTimeZone
config property only affects string parsing or the TO_DATE function (docs
on this are at [1]). If you're using the TO_DATE function, it's also
possible to supply a custom time zone in the function call (docs on this
a
Filtering a query on the leading columns of the primary key (i.e. [A],
[A,B], or [A,B,C]) will give optimal performance. This is because the
records are in sorted order based on the combination of [A,B,C], so
filtering on a leading subset of the primary key is basically the same as
filtering on the
Hi Tom,
I've tried your SQL statements with 4.3.1, and the initial one does indeed
work in 4.3.1 and later. The view definition that includes a CAST statement
still fails (due to a bug in CastParseNode for which I'll post a patch
shortly).
By the way, the way I tested this (and an easy way to te
Hi Zack,
There are two options that I know of, and I think that both of them should
work.
First is that you can supply a custom output directory to the bulk loader
using the -o parameter (see http://phoenix.apache.org/bulk_dataload.html).
In this way you can ensure that the output directory doesn
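A sketch of supplying the output directory explicitly (the HDFS path and table/input values are placeholders):

# Write the intermediate HFiles to a known, writable HDFS directory.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE \
    --input /data/example.csv \
    --output /tmp/example-bulkload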
Are you supplying the -z parameter with the ZooKeeper quorum? This will be
necessary if ZooKeeper isn't running on the localhost and/or isn't
configured in the local configuration (see
http://phoenix.apache.org/bulk_dataload.html).
- Gabriel
On Mon, Jul 6, 2015 at 12:08 PM Riesland, Zack
wrote:
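For illustration, a sketch with an explicit quorum (host names, port, and table/input values are placeholders):

# Point the loader at the ZooKeeper quorum for the target HBase cluster.
hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --zookeeper zk1,zk2,zk3:2181 \
    --table EXAMPLE_TABLE \
    --input /data/example.csv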
correct?
>
>
>
> That would clear up a lot of confusion…
>
>
>
> *From:* Gabriel Reid [mailto:gabriel.r...@gmail.com]
> *Sent:* Thursday, June 25, 2015 2:44 PM
>
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: Bug in CsvBulkLoad tool?
>
>
>
>
un the CsvBulkLoad tool at the command line and
> have the same SSH window open when it finishes, it is easy to see all the
> statistics.
>
>
>
> But where I can find this data in the logs? Since these ingests can take
> several hours, I sometimes lose my VPN connection and my
Hi Zack,
No, you don't need to worry about the name of the primary key getting in
the way of the rows being added.
Like Anil pointed out, the best thing to look at first is the job counters.
The relevant ones for debugging this situation are the total map inputs and
total map outputs, total reduc
low.
>
> What more do I need to do to increase this timeout effectively?
>
>
>
>
>
> phoenix.query.timeoutMs
>
> 90
>
>
>
>
>
>
>
> Error: org.apache.phoenix.exception.PhoenixIOException: Failed after
> attempts=36,
The default column family name is "0". This is the string containing the
character representation of zero (or in other words, a single byte with
value 48).
And yes, it's possible to read Phoenix tables using the HBase API (although
it's of course a lot easier if you go via Phoenix).
- Gabriel
O
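As an illustration of reading a Phoenix table through the plain HBase API, a quick check from the HBase shell (the table name is a placeholder) shows the qualifiers grouped under column family "0":

# Scan a single row of the underlying HBase table; values appear under family "0".
echo "scan 'EXAMPLE_TABLE', {LIMIT => 1}" | hbase shell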
a way to modify my existing table, or would I have to drop it and
> start over?
>
>
>
> *From:* Gabriel Reid [mailto:gabriel.r...@gmail.com]
> *Sent:* Tuesday, June 23, 2015 1:47 PM
> *To:* user@phoenix.apache.org
> *Subject:* Re: CsvBulkLoad output questions
>
>
>
>
ws.hasNext(SqlLine.java:2440)
>
> at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>
> at sqlline.SqlLine.print(SqlLine.java:1735)
>
> at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>
> at sqlline.SqlLine$Commands.sql(SqlL
Hi Zack,
Would it be possible to provide a few more details on what kinds of
failures that you're getting, both with the CsvBulkLoadTool, and with the
"SELECT COUNT(*)" query?
About question #1, there aren't any known bugs (that I'm aware of) that
would cause some records to go missing in the Csv
ache/phoenix/blob/master/phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java#L99-L100
>
> On Sat, May 16, 2015 at 9:14 AM, Gabriel Reid
> wrote:
>
>> Hi Nick,
>>
>> The date format is (if I'm not mistaken) ISO-8601, so I think you'll
Hi Nick,
The date format is (if I'm not mistaken) ISO-8601, so I think you'll have
to format your date values as 1970-01-01.
- Gabriel
On Fri, May 15, 2015 at 02:02 Nick Dimiduk wrote:
> Heya,
>
> Giving the RC a spin, and also investigating the impact of HBASE-13604, I'm
> having a spot of tr
Hi Kiru,
How many regions are there on this table?
Could you also share some information on the schema of the table (e.g. how
many columns are defined)?
Does a "limit 10" query also hang in this table?
Could you also elaborate a bit on the issues you were running into when
loading data into the
Hi Siva,
Yes, that's pretty much correct -- TO_DATE is returning a Date value, which
has millisecond granularity -- the fact that you're only seeing a date
(with no time component) is due to the way in which the Date is formatted,
and not it's internal value.
- Gabriel
On Sun, May 3, 2015 at 5:2
Hi Siva,
The TO_DATE function returns a java.sql.Date value, and the string
representation of the java.sql.Date value is what you're seeing in your
sqlline session.
The internal long representation of the Date value coming out of TO_DATE
will represent the date to millisecond granularity however.
Hi Kiru,
The CSV bulk loader won't automatically make multiple regions for you, it
simply loads data into the existing regions of the table. In your case, it
means that all data has been loaded into a single region (as you're
seeing), which means that any kind of operations that scan over a large
Hi Matt,
How are you viewing the timestamps (or in other words, how are you
verifying that they're not in GMT)?
The reason I ask is because internally in Phoenix, timestamps are used
without a timezone (they're just based on a long, as you've saved in your
table). However, the java.sql.Timestamp'
That certainly looks like a bug. Would it be possible to make a small
reproducible test case and if possible, log this in the Phoenix JIRA (
https://issues.apache.org/jira/browse/PHOENIX) ?
- Gabriel
On Mon, Apr 6, 2015 at 6:10 PM Marek Wiewiorka
wrote:
> Hi All,
> I came across a weird situati
About the record count differences: the output values of the mapper are
KeyValues, not Phoenix rows. Each column's value is stored in separate
KeyValue, so one input row with a single-column primary key and five other
columns will result in 6 output KeyValues: one KeyValue for each of the
non-prima
Hi Ashish,
The other columns are being cut off by the size of your terminal window. If
you make your window larger, you'll be able to see the additional columns.
- Gabriel
On Thu, Apr 2, 2015 at 9:46 PM ashish tapdiya
wrote:
> I am issuing command "!tables" using sqlline to see all tables. Th
Hi,
I believe Squirrel does some kind of implicit "LIMIT 100" on statements, so
my guess would be that it's adding a "LIMIT 100" to your UPSERT SELECT
statement.
Could you try the same thing using sqlline to verify if it's a problem
there as well?
- Gabriel
On Mon, Mar 30, 2015 at 12:28 PM Dar
Hi,
No, there isn't currently any way to do any kind of variable/parameter
replacement via psql.py. I think there must be some clever ways to do this
automatically on the command line outside of psql, although I'm not aware
of any specific way of doing it.
- Gabriel
On Fri, Mar 27, 2015 at 9:27
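One such approach, sketched under the assumption of a :run_date placeholder convention and placeholder file/host names:

# Substitute a value into a SQL template before handing it to psql.py.
sed 's/:run_date/2015-03-27/g' query_template.sql > query_resolved.sql
psql.py localhost query_resolved.sql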
You can't set auto-commit via the connection properties or JDBC url in
version 3.0.0. However, this is possible (via the AutoCommit property) as
of version 3.3 and 4.3.
Other client-side properties can be set via the Properties object passed in
to DriverManager.connect.
- Gabriel
On Wed, Mar 18
The correct syntax for a Phoenix JDBC url with a tenant id is as follows:
localhost:2181;TenantId=foo
Note that the TenantId parameter is capitalized (it's case-sensitive).
However (on Linux or Mac at least), it's not currently possible to connect
with a tenant-specific connection like this, as t
Hi Noam,
From the perspective of row-based data storage you're writing the same
data, but from the perspective of HBase it's not the same at all. This is
because HBase stores everything as KeyValues, with one KeyValue per column
in Phoenix.
Lets say you've got a table with a single primary key c
CDH 5.3.1.
There are a couple of workarounds mentioned in the linked page for the
meantime, but I think the easiest one which should work is just to do
this before starting your import job:
export HADOOP_USER_CLASSPATH_FIRST=true
- Gabriel
On Mon, Mar 2, 2015 at 2:32 PM, Gabriel Reid wrote
)Lorg/joda/time/format/DateTimeFormatter;
>
> 15/03/02 15:21:35 INFO client.HConnectionManager$HConnectionImplementation:
> Closing zookeeper sessionid=0x24bd5223c471259
>
> 15/03/02 15:21:35 INFO zookeeper.ZooKeeper: Session: 0x24bd5223c471259
> closed
>
> 15
Is there more of a stack trace you could post around the error?
I'm guessing that this is something along the lines of
NoSuchMethodException. Could you also post some info on your environment?
Which version of Hadoop and HBase are you using? Is there an alternate
version of joda-time on your class
Hi Matt,
Although the object representation of the Phoenix DECIMAL type is
BigDecimal, the byte-level encoding is different than that of
Bytes.toBytes(BigDecimal). The reason for this is to allow for
ordering of these values based on comparison of binary values. Sorting
the values with binary value c
it. Use Phoenix just for sql queries.
>
> I think we should enhance the Phoenix data loader in the same way like Hbase
> loader. What do you say, any thoughts on this?
>
> Thanks,
> Siva.
>
> On Wed, Feb 11, 2015 at 11:34 PM, Gabriel Reid
> wrote:
>>
>> Hi
Hi Siva,
Handling multi-line records with the Bulk CSV Loader (i.e.
MapReduce-based loader) definitely won't support records split over
multiple input lines. It could be that loading via PSQL (as described
on http://phoenix.apache.org/bulk_dataload.html) will allow multi-line
records, as this migh