To: Jake Russ <jr...@bloomintelligence.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: Update MySQL table via Spark/SparkR?
Hi Jake,
This is an issue across all RDBMSs, including Oracle. When you are updating,
you have to commit or roll back in the RDBMS.
Hi everyone,
I'm currently using SparkR to read data from a MySQL database, perform some
calculations, and then write the results back to MySQL. Is it still true that
Spark does not support UPDATE queries via JDBC? I've seen many posts on the
internet saying that Spark's DataFrameWriter does not support an update mode.
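A common workaround (a sketch only, under assumptions — the table name `results`, the columns `score`/`id`, and the connection details are all hypothetical): collect the computed values to the driver and apply the UPDATEs through plain JDBC with an explicit commit/rollback, since DataFrameWriter's JDBC save modes can only append or overwrite rows, not update them.

```java
// Sketch: applying computed results as UPDATEs over plain JDBC, because
// Spark's DataFrameWriter has no update mode. Table name, column names,
// and the JDBC URL are hypothetical placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class JdbcUpdateSketch {
    public static void applyUpdates(String url, String user, String pass,
                                    Map<Long, Double> scoresById)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, pass)) {
            conn.setAutoCommit(false); // batch all UPDATEs into one transaction
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE results SET score = ? WHERE id = ?")) {
                for (Map.Entry<Long, Double> e : scoresById.entrySet()) {
                    ps.setDouble(1, e.getValue());
                    ps.setLong(2, e.getKey());
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();       // commit on success...
            } catch (SQLException ex) {
                conn.rollback();     // ...or roll back, as noted in the reply above
                throw ex;
            }
        }
    }
}
```

This only scales while the updated rows fit on the driver; for large result sets you would partition the updates, but the commit/rollback point stands either way.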
The ScalaTest code that is enclosed at the end of this email message
demonstrates what appears to be a bug in the KryoSerializer. This code was
executed from IntelliJ IDEA (Community Edition) under Mac OS X 10.11.2.
The KryoSerializer is enabled by updating the original SparkContext (that is
I need to register with the KryoSerializer a Tuple3 that is generated by a call
to the sortBy() method, which eventually calls collect() from
Partitioner.RangePartitioner.sketch().
The IntelliJ IDEA debugger indicates that the component types for the Tuple3 are
java.lang.Integer, java.lang.Integer, and long[]. I've enclosed the associated
source code, so if anyone has suggestions for improvement, please feel free to
communicate them to me.
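For reference, a minimal configuration sketch for the registration described above (the property names are standard Spark settings; registering array classes such as long[] by name can be version-dependent, so programmatic registration via SparkConf.registerKryoClasses may still be needed):

```
# spark-defaults.conf (sketch): enable Kryo and fail fast on unregistered
# classes, so missing registrations surface as errors immediately
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired  true
# scala.Tuple3 is the class behind the Tuple3 produced by
# sortBy() / RangePartitioner.sketch()
spark.kryo.classesToRegister     scala.Tuple3
```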
Thanks,
Russ Brown
Distributed R-trees are not very common. Most "big data" spatial solutions
collapse multi-dimensional data into a distributed one-dimensional index
using a space-filling curve. Many implementations exist outside of Spark,
e.g. for HBase or Accumulo. It's simple enough to write a map function that
produces a representation of an entire logical row; it's a useful convenience
if you can be sure that your rows always fit in memory.
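As a concrete illustration of the space-filling-curve idea (a self-contained sketch, not tied to any particular HBase or Accumulo schema): a Z-order (Morton) key interleaves the bits of the coordinates, so points that are close in 2-D usually land close together in the 1-D key space used as the row key.

```java
// Sketch: mapping 2-D points onto a Z-order (Morton) curve, the technique
// described above for collapsing multi-dimensional data into a distributed
// one-dimensional index.
public class Morton {
    // Interleave the bits of x and y: x bits go to even positions,
    // y bits to odd positions. Nearby (x, y) points tend to get nearby keys.
    public static long interleave(int x, int y) {
        long z = 0L;
        for (int i = 0; i < 32; i++) {
            z |= ((long) (x >>> i) & 1L) << (2 * i);
            z |= ((long) (y >>> i) & 1L) << (2 * i + 1);
        }
        return z;
    }

    public static void main(String[] args) {
        // (3, 5): x = 0b011, y = 0b101 -> interleaved 0b100111 = 39
        System.out.println(interleave(3, 5)); // prints 39
    }
}
```

The resulting long (or its byte representation) would be the leading component of the row key; range scans over Z-order keys then cover spatial rectangles with a bounded number of contiguous key ranges.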
I haven't tested it since Spark 1.0.1 but I doubt anything important has
changed.
Regards,
-Russ
On Thu, Mar 26, 2015 at 11:41 AM, David Holiday dav...@annaisystems.com wrote:
will be better for your cluster.
-Russ
On Mon, Sep 29, 2014 at 7:43 PM, Nan Zhu zhunanmcg...@gmail.com wrote:
Can you look at your HBase UI to check whether your job is just reading
from a single region server?
Best,
--
Nan Zhu
On Monday, September 29, 2014 at 10:21 PM, Tao Xiao wrote:
I use newAPIHadoopRDD with AccumuloInputFormat. It produces a PairRDD using
Accumulo's Key and Value classes, both of which extend Writable. Works like
a charm. I use the same InputFormat for all my MR jobs.
-Russ
On Wed, Sep 24, 2014 at 9:33 AM, Steve Lewis lordjoe2...@gmail.com wrote:
No, they do not implement Serializable. There are a couple of places where
I've had to do a Text-String conversion but generally it hasn't been a
problem.
-Russ
On Wed, Sep 24, 2014 at 10:27 AM, Steve Lewis lordjoe2...@gmail.com wrote:
Do your custom Writable classes implement Serializable - I
(hadoopJob.getConfiguration(),
AccumuloInputFormat.class, Key.class, Value.class);
}
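The truncated snippet above looks like the tail of a newAPIHadoopRDD call; a completed sketch might read as follows (the variable names sc and hadoopJob, and the surrounding setup, are assumptions, not taken from the original code):

```java
// Sketch: building a JavaPairRDD<Key, Value> over Accumulo via
// newAPIHadoopRDD, as described earlier in the thread. The variable names
// (sc, hadoopJob) are assumed.
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.spark.api.java.JavaPairRDD;

JavaPairRDD<Key, Value> pairs = sc.newAPIHadoopRDD(
        hadoopJob.getConfiguration(),  // carries the Accumulo connection settings
        AccumuloInputFormat.class,     // Accumulo's Hadoop InputFormat
        Key.class,                     // Accumulo Key (implements Writable)
        Value.class);                  // Accumulo Value (implements Writable)
```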
There's tons of docs around how to operate on a JavaPairRDD. But you're
right, there's hardly anything at all re: how to plug Accumulo into Spark.
-Russ
On Wed, Sep 10, 2014 at 1:17 PM, Megavolt jbru...@42six.com wrote:
down to 30s from 18 minutes, and I'm seeing much better
utilization of my Accumulo tablet servers.
-Russ
On Tue, Sep 9, 2014 at 5:13 PM, Russ Weeks rwe...@newbrightidea.com wrote:
Hi,
I'm trying to execute Spark SQL queries on top of the AccumuloInputFormat.
Not sure if I should be asking
tablet servers with active scans. Since the data is
spread across all the tablet servers, I hoped to see 8!
I realize there are a lot of moving parts here, but I'd appreciate any advice
about where to start looking.
Using Spark 1.0.1 with Accumulo 1.6.
Thanks!
-Russ