Hi Josh,
thanks for your reply. I'm trying to implement a bulk save to Phoenix
with Apache Spark, and the code you linked helped me a lot. I'm now
facing an issue with composite primary keys: I cannot find anywhere in
the Phoenix code where the row-key is built using the partial Phoenix
primary key.
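
One way to sidestep hunting for that code path: the CSV bulk loader
never assembles the row key by hand either. It runs the UPSERTs on a
connection with auto-commit disabled and reads the encoded row keys and
KeyValues back through PhoenixRuntime.getUncommittedDataIterator,
letting Phoenix itself do the composite-key serialization. A minimal
sketch of that trick (the table DDL, JDBC URL, and values here are
hypothetical):

import java.sql.DriverManager
import org.apache.phoenix.util.PhoenixRuntime

// Hypothetical table for illustration:
//   CREATE TABLE T (K1 VARCHAR NOT NULL, K2 INTEGER NOT NULL, V VARCHAR
//     CONSTRAINT PK PRIMARY KEY (K1, K2))
val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
conn.setAutoCommit(false) // buffer mutations instead of sending them

val stmt = conn.prepareStatement("UPSERT INTO T VALUES (?, ?, ?)")
stmt.setString(1, "a")
stmt.setInt(2, 1)
stmt.setString(3, "v")
stmt.executeUpdate()

// Each Pair holds the full encoded row key plus the KeyValues for that
// row. The row key is the PK columns serialized by their PDataType and
// concatenated, with variable-length columns separated by a zero byte.
val it = PhoenixRuntime.getUncommittedDataIterator(conn)
while (it.hasNext) {
  val pair = it.next()
  val rowKey: Array[Byte] = pair.getFirst
  val kvs = pair.getSecond // java.util.List of KeyValue for this row
}
conn.rollback() // discard the buffered mutations; nothing hits HBase
conn.close()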
Hi Antonio,
Certainly, a JIRA ticket with a patch would be fantastic.
Thanks!
Josh
On Wed, Sep 28, 2016 at 12:08 PM, Antonio Murgia wrote:
> Thank you very much for your insights Josh. If I decide to develop a small
> Phoenix library that does, through Spark, what the CSV loader does, I'll
> surely write to the mailing list, open a JIRA, or maybe even open a PR,
> right?
Thank you very much for your insights Josh. If I decide to develop a
small Phoenix library that does, through Spark, what the CSV loader
does, I'll surely write to the mailing list, open a JIRA, or maybe even
open a PR, right?
Thank you again
#A.M.
On 09/28/2016 05:10 PM, Josh Mahonin wrote:
Hi Antonio,
You're correct, the phoenix-spark output uses the Phoenix Hadoop
OutputFormat under the hood, which effectively does a parallel, batch JDBC
upsert. It should scale depending on the number of Spark executors,
RDD/DataFrame parallelism, and the number of HBase RegionServers, though
admittedly ...
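
To make that concrete, here is a minimal sketch of the save path
described above; the table name, columns, and ZooKeeper quorum are
hypothetical stand-ins:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.phoenix.spark._ // adds saveToPhoenix to RDDs of tuples

val sc = new SparkContext(new SparkConf().setAppName("phoenix-save"))

// One tuple per row; fields map positionally onto the listed columns.
val rows = sc.parallelize(Seq((1L, "foo"), (2L, "bar")))

// Each partition opens its own connection via PhoenixOutputFormat and
// issues batched UPSERTs, so the write parallelism follows the RDD's
// partitioning.
rows.saveToPhoenix(
  "OUTPUT_TABLE",              // hypothetical target table
  Seq("ID", "COL1"),           // columns matching the tuple fields
  zkUrl = Some("zk-host:2181") // hypothetical ZooKeeper quorum
)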
Hi,
I would like to perform a bulk insert to HBase using Apache Phoenix from
Spark. I tried the Apache Spark Phoenix library but, as far as I was
able to understand from the code, it looks like it performs a JDBC batch
of upserts (am I right?). Instead I want to perform a bulk load like the
one the CSV bulk loader performs.
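
For reference, the bulk load in question is the MapReduce-driven
CsvBulkLoadTool, which encodes rows into HFiles and hands them to
HBase's incremental load, bypassing JDBC entirely. A sketch of invoking
it programmatically; the table, input path, and ZooKeeper quorum are
hypothetical:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.util.ToolRunner
import org.apache.phoenix.mapreduce.CsvBulkLoadTool

object PhoenixBulkLoad {
  def main(args: Array[String]): Unit = {
    // Same job the CLI runs: parse the CSV, write HFiles with a
    // MapReduce job, then hand them to HBase's incremental loader.
    val exit = ToolRunner.run(new Configuration(), new CsvBulkLoadTool(),
      Array(
        "--table", "OUTPUT_TABLE",    // hypothetical target table
        "--input", "/tmp/input.csv",  // hypothetical HDFS input path
        "--zookeeper", "zk-host:2181" // hypothetical ZK quorum
      ))
    System.exit(exit)
  }
}

The same job is more commonly launched from the shell with hadoop jar
and the Phoenix client jar rather than embedded in application code.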