So... one solution would be to use a non-Jurassic version of Jackson. 2.6
will drop before too long, and 3.0 is in longer-term planning. The 1.x
series is long deprecated.
If you're genuinely stuck with something ancient, then you need to include
the JAR that contains the class, and 1.9.13 does
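For anyone wiring this up with sbt instead of Maven, pinning the legacy
1.9.13 artifacts explicitly would look something like this (a sketch; the
coordinates are the standard Codehaus ones for the Jackson 1.x line):

    // build.sbt -- sketch: pin the legacy Jackson 1.x artifacts directly.
    // jackson-mapper-asl (ObjectMapper and friends) is usually the JAR
    // with the "missing" class; it depends on jackson-core-asl.
    libraryDependencies ++= Seq(
      "org.codehaus.jackson" % "jackson-core-asl"   % "1.9.13",
      "org.codehaus.jackson" % "jackson-mapper-asl" % "1.9.13"
    )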
I would suggest checking out disk IO on the nodes in your cluster and then
reading up on the limiting behaviors that accompany different kinds of EC2
storage. Depending on how things are configured for your nodes, you may
have a local storage configuration that provides "bursty" IOPS where you
get
Unfortunately, unless you impose restrictions on the XML file (e.g., where
namespaces are declared, whether entity replacement is used, etc.), you
really can't parse only a piece of it even if you have start/end elements
grouped together. If you want to deal effectively (and scalably) with
large XML
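The scalable route is a streaming parse rather than a whole-document one;
here is a minimal sketch using the JDK's built-in StAX API (the element
name and file handling are illustrative, not from the original thread):

    import java.io.FileInputStream
    import javax.xml.stream.{XMLInputFactory, XMLStreamConstants}

    // Walk a large XML file one event at a time; memory use stays
    // constant no matter how big the document is.
    object StreamLargeXml {
      def main(args: Array[String]): Unit = {
        val reader = XMLInputFactory.newInstance()
          .createXMLStreamReader(new FileInputStream(args(0)))
        try {
          var records = 0
          while (reader.hasNext) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT &&
                reader.getLocalName == "record") // hypothetical element name
              records += 1
          }
          println(s"saw $records <record> elements")
        } finally reader.close()
      }
    }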
We use Luigi for this purpose. (Our pipelines are typically on AWS (no
EMR) backed by S3 and using combinations of Python jobs, non-Spark
Java/Scala, and Spark. We run Spark jobs by connecting drivers/clients to
the master, and those are what gets invoked from Luigi.)
—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
Hi, Mans --
Both of those versions of Jackson are pretty ancient. Do you know which of
the Spark dependencies is pulling them in? It would be good for us (the
Jackson, Woodstox, etc., folks) to see if we can get people to upgrade to
more recent versions of Jackson.
-- Paul
—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
Hi, Robert --
I wonder if this is an instance of SPARK-2075:
https://issues.apache.org/jira/browse/SPARK-2075
-- Paul
—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
On Wed, Jun 25, 2014 at 6:28 AM, Robert James wrote:
> On 6/24/14, Robert James wrote:
> > My app works f
a you are running with? Are they the same?
>
> Just off the cuff, I wonder if this is related to:
> https://issues.apache.org/jira/browse/SPARK-1520
>
> If it is, it could appear that certain functions are not in the jar
> because they go beyond the extended zip boundary `jar tvf`
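(If you want to check a suspect assembly directly, counting its entries
with java.util.zip is a quick sanity test; classic zip directories top out
at 65,535 entries before zip64 extensions kick in. A sketch, with a
made-up class name:)

    import java.util.zip.ZipFile

    // Count the entries in an uberjar and look for a particular class;
    // ZipFile on a modern JDK reads zip64 archives, so it sees entries
    // that older zip tooling can drop silently.
    object JarCheck {
      def main(args: Array[String]): Unit = {
        val zip = new ZipFile(args(0))
        try {
          val entries = zip.entries()
          var count = 0
          while (entries.hasMoreElements) {
            val name = entries.nextElement().getName
            count += 1
            if (name.endsWith("SomeMissingClass.class")) // hypothetical
              println(s"found: $name")
          }
          println(s"$count entries total")
        } finally zip.close()
      }
    }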
Moving over to the dev list, as this isn't a user-scope issue.
I just ran into this issue with the missing saveAsTextFile, and here's a
little additional information:
- Code ported from 0.9.1 up to 1.0.0; works with local[n] in both cases.
- Driver built as an uberjar via Maven.
- Deployed to sma
Hi, Adrian --
If my memory serves, you need 1.7.7 of the various slf4j modules to avoid
that issue.
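If it saves someone a search later, forcing that version across the tree
in sbt looks roughly like this (a sketch; add whichever slf4j bridge
modules your build actually pulls in):

    // build.sbt -- sketch: force one slf4j version everywhere so the
    // API and binding JARs can't drift apart.
    dependencyOverrides ++= Seq(
      "org.slf4j" % "slf4j-api"      % "1.7.7",
      "org.slf4j" % "jcl-over-slf4j" % "1.7.7",
      "org.slf4j" % "jul-to-slf4j"   % "1.7.7"
    )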
Best.
-- Paul
—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
On Mon, May 12, 2014 at 7:51 AM, Adrian Mocanu wrote:
> Hey guys,
>
> I've asked before, in Spark 0.9 - I now
Hi, Laurent --
That's the way we package our Spark jobs (i.e., with Maven). You'll need
something like this:
https://gist.github.com/prb/d776a47bd164f704eecb
That packages separate driver (which you can run with java -jar ...) and
worker JAR files.
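On the driver side, the worker JAR gets handed to Spark so it is shipped
out to the executors; roughly like this (names and paths hypothetical):

    import org.apache.spark.{SparkConf, SparkContext}

    // Driver entry point, launched with `java -jar driver.jar`.
    // setJars ships the separately-packaged worker classes to the
    // executors.
    object Driver {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("example-job")            // hypothetical name
          .setMaster("spark://master:7077")     // hypothetical master URL
          .setJars(Seq("/path/to/worker.jar"))  // hypothetical path
        val sc = new SparkContext(conf)
        try println(sc.parallelize(1 to 100).sum())
        finally sc.stop()
      }
    }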
Cheers.
-- Paul
—
p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> packaged in a .jar file and I execute .addJar on the
> SparkContext. My expectation is that the whole jar together with that
> function is available on every worker automatically. Is that not a valid
> expectation?
>
> Ognen
>
>
> On 3/13/14, 11:09 AM, Paul Brown wrote:
>
It's trying to send the enclosing driver-side object along with the
closure. You just need to have the jsonMatches function available on the
worker side of the interaction rather than on the driver side, e.g., put
it on an object CodeThatIsRemote that gets shipped with the JARs and then
filter(CodeThatIsRemote.jsonMatches), and you should be off to the races.
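Spelled out, that looks roughly like the following (the predicate body is
made up for illustration):

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    // Lives in the JAR shipped to the workers; because it's a top-level
    // object, executors resolve it from the JAR instead of Spark having
    // to serialize driver-side state along with the closure.
    object CodeThatIsRemote {
      def jsonMatches(line: String): Boolean =
        line.contains("\"match\"") // made-up predicate
    }

    object DriverSide {
      // The shipped closure only references CodeThatIsRemote by name.
      def matching(sc: SparkContext, path: String): RDD[String] =
        sc.textFile(path).filter(CodeThatIsRemote.jsonMatches)
    }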
Hi, Sergey --
Here's my recipe, implemented via Maven; YMMV if you need to do it via sbt,
etc., but it should be equivalent:
1) Replace org.apache.spark.Logging trait with this:
https://gist.github.com/prb/bc239b1616f5ac40b4e5 (supplied by Patrick
during the discussion on the dev list; sketched below)
2) Amend y
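For anyone reading without the gist handy, the trait in step 1 amounts to
a thin slf4j wrapper along these lines (a sketch, not the gist verbatim):

    import org.slf4j.{Logger, LoggerFactory}

    // Stand-in for org.apache.spark.Logging: mix it in and call
    // log.info(...), log.warn(...), etc. @transient keeps the logger
    // out of serialized closures.
    trait Logging {
      @transient protected lazy val log: Logger =
        LoggerFactory.getLogger(getClass.getName)
    }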
> JSON text, but they don't have to stay that way. Would
> you recommend I somehow convert the files into another format, say Avro,
> before handling them with Spark?
>
> Paul,
>
> When you say not to write your ser/de as inline blocks, could you provide
> a simple example?