Hi All,
Spark 1.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-1.5 maintenance branch of Spark. We
*strongly recommend* that all 1.5.0 users upgrade to this release.
The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
http://spark.apach
Exactly, that's a much better way to put it.
Thanks,
Ewan
-- Original message --
From: Yin Huai
Date: Thu, 1 Oct 2015 23:54
To: Ewan Leith
Cc: r...@databricks.com; dev@spark.apache.org
Subject: Re: Dataframe nested schema inference from Json without type conflicts
Hi Ewan,
For y
Update for those who are still interested: djinni is a nice tool for
generating Java/C++ bindings. Before today djinni's Java support was only
aimed at Android, but now djinni works with (at least) Debian, Ubuntu, and
CentOS.
djinni will help you run C++ code in-process with the caveat that djinn
Hi Ewan,
For your use case, you only need the schema inference to pick up the
structure of your data (basically you want Spark SQL to infer the type of
complex values like arrays and structs but keep the type of primitive
values as strings), right?
Thanks,
Yin
On Thu, Oct 1, 2015 at 2:27 PM, Ew
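Yin's suggestion above (infer the structure, but treat every primitive as a string) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Spark SQL's actual inference code; the function name and the string type labels are made up for the example:

```python
import json

def infer_schema(value):
    """Infer the structural shape of a parsed JSON value, but report every
    primitive leaf (number, bool, string, null) as "string", so a small
    batch can't lock the schema into an overly specific primitive type."""
    if isinstance(value, dict):
        return {k: infer_schema(v) for k, v in value.items()}
    if isinstance(value, list):
        # For simplicity, take the first element's structure; a real
        # merger would unify the schemas of all elements.
        return [infer_schema(value[0])] if value else ["string"]
    return "string"

record = json.loads('{"id": 1, "tags": ["a", "b"], "meta": {"score": 0.5}}')
print(infer_schema(record))
# → {'id': 'string', 'tags': ['string'], 'meta': {'score': 'string'}}
```

Arrays and structs keep their nesting, while `1` and `0.5` both come out as `"string"`, which is the behavior Yin is describing.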
We could, but if a client sends some unexpected fields in their records (which
happens more than I'd like; our schema seems to constantly evolve), it's
fantastic how Spark picks up on that data and includes it.
Passing in a fixed schema loses that nice additional ability, though it's what
we'll p
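The trade-off Ewan describes (a fixed schema drops fields it doesn't know about) can be softened by projecting each record onto the fixed schema while collecting the leftovers for inspection. A minimal plain-Python sketch, with hypothetical names, just to show the idea:

```python
def apply_schema(record, schema):
    """Project a parsed JSON record onto a fixed list of top-level field
    names, and collect any unexpected keys separately so that evolving
    client data isn't silently lost."""
    known = {k: record.get(k) for k in schema}
    extras = {k: v for k, v in record.items() if k not in schema}
    return known, extras

row, extras = apply_schema({"id": 1, "name": "x", "new_field": True},
                           schema=["id", "name"])
print(row)     # the fields matching the fixed schema
print(extras)  # the surprise fields a client added
```

The `extras` side could be logged or fed back into a schema-evolution process, rather than being discarded the way a strict fixed schema would.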
You can pass the schema into json directly, can't you?
On Thu, Oct 1, 2015 at 10:33 AM, Ewan Leith
wrote:
> Hi all,
>
>
>
> We really like the ability to infer a schema from JSON contained in an
> RDD, but when we’re using Spark Streaming on small batches of data, we
> sometimes find that Spark
BTW - the merge window for 1.6 is September+October. The QA window is
November and we'll expect to ship probably early December. We are on a
3 month release cadence, with the caveat that there is some
pipelining... as we finish release X we are already starting on
release X+1.
- Patrick
On Thu, O
Ah - I can update it. Usually I do it after the release is cut. It's
just a standard 3 month cadence.
On Thu, Oct 1, 2015 at 3:55 AM, Sean Owen wrote:
> My guess is that the 1.6 merge window should close at the end of
> November (2 months from now)? I can update it but wanted to check if
> anyone
Depending on the number of cores configured for the executor, the scheduler
will assign that many tasks. So yes, they execute in parallel.
On 30 Sep 2015 14:51, "gsvic" wrote:
> Concerning task execution, does a worker execute its assigned tasks in
> parallel or sequentially?
>
>
>
> --
> View this messag
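The answer above (one concurrent task per configured core) can be mimicked with a thread pool sized by a core count. This is only an analogy in plain Python, not Spark's scheduler; the core count and task function are invented for the example:

```python
from concurrent.futures import ThreadPoolExecutor

EXECUTOR_CORES = 4  # stand-in for a configured executor core count

def run_task(partition):
    # A trivial "task": reduce one partition of data.
    return sum(partition)

partitions = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]

# At most EXECUTOR_CORES tasks run concurrently; the fifth partition
# waits for a slot, just as a task waits for a free core.
with ThreadPoolExecutor(max_workers=EXECUTOR_CORES) as pool:
    results = list(pool.map(run_task, partitions))
print(results)  # → [3, 7, 11, 15, 19]
```

With five partitions and four slots, four tasks run in parallel and the fifth is queued, which matches the behavior described in the reply.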
Hi all,
We really like the ability to infer a schema from JSON contained in an RDD, but
when we're using Spark Streaming on small batches of data, we sometimes find
that Spark infers a more specific type than it should use, for example if the
JSON in that small batch only contains integer value
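The problem Ewan raises is essentially one of type widening: when a later batch disagrees with an earlier inference, the merged type should widen rather than conflict. A toy sketch of such a merge rule in plain Python (the rules here are hypothetical, chosen only to illustrate the idea of widening with a string fallback):

```python
def widen(a, b):
    """Merge two inferred primitive type names: identical types stay,
    int widens to double, and any other mismatch falls back to string
    (illustrative rules only, not Spark's actual type lattice)."""
    if a == b:
        return a
    if {a, b} == {"int", "double"}:
        return "double"
    return "string"

# A batch of only integers, later joined by a batch with decimals:
print(widen("int", "double"))   # → double
# A genuine conflict falls back to the safest representation:
print(widen("int", "struct"))   # → string
```

Under a rule like this, a first micro-batch containing only integers would not permanently pin the column to an integer type once wider values arrive.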
My guess is that the 1.6 merge window should close at the end of
November (2 months from now)? I can update it but wanted to check if
anyone else has a preferred tentative plan.
On Thu, Oct 1, 2015 at 2:20 AM, Meethu Mathew wrote:
> Hi,
> In the https://cwiki.apache.org/confluence/display/SPARK/W