task can read and process
> unsplittable data - versus many tasks spread across the cluster.
>
> On Wed, Dec 30, 2015 at 6:45 AM, Dawid Wysakowicz <
> wysakowicz.da...@gmail.com> wrote:
>
>> Hasn't anyone used Spark with ORC and snappy compression?
>>
>> 2015-12-
Hasn't anyone used Spark with ORC and snappy compression?
2015-12-29 18:25 GMT+01:00 Dawid Wysakowicz <wysakowicz.da...@gmail.com>:
> Hi,
>
> I have a table in Hive stored as ORC with compression = snappy. I try to
> execute a query on that table that fails (previously I ran it
on that matter.
Regards
Dawid Wysakowicz
Hi Ajay,
In short: no, there is no easy way to do that. But if you'd like to
play around with this topic, a good starting point would be this blog post
from SequenceIQ:
http://blog.sequenceiq.com/blog/2014/08/22/spark-submit-in-java/.
I heard rumors that there is some work going on to
at 7:50 AM, Muhammad Atif <muhammadatif...@gmail.com>
wrote:
Hi Dawid
The best place to get started is the Spark SQL Guide from Apache:
http://spark.apache.org/docs/latest/sql-programming-guide.html
Regards
Muhammad
On Thu, Aug 20, 2015 at 5:46 AM, Dawid Wysakowicz <
wysakowicz.da...@gmail.com> wrote:
Hi,
I would like to dive into Spark SQL and get to know the architecture,
good practices, and some internals better. Could you recommend some
materials on this matter?
Regards
Dawid
I am not 100% sure, but flatMap probably unwinds the tuples. Try map
instead.
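The difference can be sketched with a plain-Python analogy (not the actual Spark API; list comprehensions stand in for RDD operations, and the sample lines are made up):

```python
# map keeps each produced tuple intact as one element;
# flatMap flattens the tuple into separate elements,
# which is likely why the key-value result looked strange.
lines = ["alice 1", "bob 2"]

# analogue of rdd.map(lambda line: tuple(line.split()))
mapped = [tuple(line.split()) for line in lines]

# analogue of rdd.flatMap(lambda line: line.split())
flat_mapped = [part for line in lines for part in line.split()]

print(mapped)       # [('alice', '1'), ('bob', '2')]
print(flat_mapped)  # ['alice', '1', 'bob', '2']
```

So with flatMap each key and each value becomes its own element; map preserves the pairing.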
2015-08-19 13:10 GMT+02:00 Jerry OELoo <oylje...@gmail.com>:
Hi.
I want to parse a file and return key-value pairs with pySpark, but the
result is strange to me.
test.sql is a big file and each line is a username
No, the data is not stored between two jobs, but it is stored for the
lifetime of a job, and a job can run multiple actions.
For sharing an RDD between jobs you can have a look at Spark
Job Server (spark-jobserver: https://github.com/ooyala/spark-jobserver) or
some in-memory storages:
-- Forwarded message --
From: Dawid Wysakowicz <wysakowicz.da...@gmail.com>
Date: 2015-08-14 9:32 GMT+02:00
Subject: Re: Using unserializable classes in tasks
To: mark <manwoodv...@googlemail.com>
I am not an expert, but first of all check whether there is a ready connector
(you mentioned Cassandra - check spark-cassandra-connector:
https://github.com/datastax/spark-cassandra-connector ).
If you really want to do something on your own, all