Any thoughts or pointers to relevant design documents appreciated...
Thanks!
Jakub Dubovsky
> hat is fixed would be for you to manually specify the kryo encoder
> <http://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/Encoders.html#kryo(scala.reflect.ClassTag)>.
>
> On Thu, Dec 15, 2016 at 8:18 AM, Jakub Dubovsky <
> spark.dubovsky.ja...@gmail.c
efined case classes containing
scala.collection.immutable.List(s). This does not work now because these
lists are converted to ArrayType (Seq). This then fails a constructor
lookup because of seq-is-not-a-list error...
This means that for now we are stuck with using RDDs.
Thanks for any insights!
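The workaround suggested above can be sketched as follows (a minimal sketch; the `Wrapper` case class and session setup are hypothetical, and note that `Encoders.kryo` stores the whole object as a single binary column, so typed operations keep working but columnar/SQL access to the fields is lost):

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical case class whose immutable.List field trips up the
// built-in reflective product encoder on older Spark versions.
case class Wrapper(id: Int, values: List[String])

object KryoEncoderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("kryo-workaround").getOrCreate()

    // Fall back to Kryo serialization instead of the product encoder.
    implicit val wrapperEnc: Encoder[Wrapper] = Encoders.kryo[Wrapper]

    val ds = spark.createDataset(Seq(Wrapper(1, List("a", "b"))))
    println(ds.map(w => w.copy(id = w.id + 1)).collect().toList)

    spark.stop()
  }
}
```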
SparkSession from somewhere. By importing the implicits from
>>> spark.implicits._ they have access to a SparkSession for operations like
>>> this.
>>>
>>> On Fri, Oct 14, 2016 at 4:42 PM, Jakub Dubovsky <
>>> spark.dubovsky.ja...@gmail.com> wrote:
>>>
Hey community,
I would like to *educate* myself about why all *sql implicits* (most
notably the conversions to the Dataset API) are imported from an *instance* of
SparkSession rather than via static imports.
With this design one runs into problems like this
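The instance-scoped import looks like this in practice (a minimal sketch; the point being that the implicits live on the `SparkSession` instance because the `toDS`/`toDF` conversions they provide need a concrete session to build Datasets, which is why a static import cannot work):

```scala
import org.apache.spark.sql.SparkSession

object ImplicitsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("implicits").getOrCreate()

    // The implicits are a member of this SparkSession instance,
    // not a static object: the conversions need `spark` itself.
    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    println(ds.map(_ * 2).collect().toList)

    spark.stop()
  }
}
```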
omputation of "top N" on a Dataset, so I don't think this is
> relevant.
>
>
> orderBy + take is already the way to accomplish "Dataset.top". It works
> on Datasets, and therefore DataFrames too, for the reason you give. I'm not
> sure what you're asking there
> DataFrame-like
> counterpart already that doesn't really need wrapping in a different
> API.
>
> On Thu, Sep 1, 2016 at 12:53 PM, Jakub Dubovsky
> <spark.dubovsky.ja...@gmail.com> wrote:
> > Hey all,
> >
> > in the RDD API there is a very useful method called top. It finds
Hey all,
in the RDD API there is a very useful method called top. It finds the top n
records according to a certain ordering without sorting all records. Very
useful! There is no top method nor similar functionality in the Dataset API.
Does anybody have any clue why? Is there any specific reason for this?
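The orderBy-plus-take pattern mentioned in the reply can be sketched like this (hypothetical column names and data; Catalyst can plan a sort followed by a small limit as a top-k computation, so the full dataset need not be completely sorted):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

object TopNExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("top-n").getOrCreate()
    import spark.implicits._

    val ds = Seq(("a", 3), ("b", 1), ("c", 5), ("d", 2)).toDF("key", "score")

    // Dataset analogue of rdd.top(2)(Ordering.by(_._2)):
    val top2 = ds.orderBy(desc("score")).limit(2).collect()
    top2.foreach(println)

    spark.stop()
  }
}
```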
>
>
>
> On 31 August 2016 at 14:53, Jakub Dubovsky <spark.dubovsky.ja...@gmail.com
> > wrote:
>
Hey all,
I have a conceptual question which I have a hard time finding an answer for.
Is the JVM where the Spark driver runs also used to run computations over
RDD partitions and persist them? The answer is obvious for local mode
(yes). But when it runs on YARN/Mesos/standalone with many executors
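One way to check this empirically is to record the hostname seen inside each task (a diagnostic sketch; in local mode every partition reports the driver's own host, whereas with remote executors on YARN/Mesos/standalone the executors' hosts appear instead, showing that the driver JVM does not process partitions there):

```scala
import java.net.InetAddress
import org.apache.spark.sql.SparkSession

object WhereDoTasksRun {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("where-tasks-run").getOrCreate()
    val sc = spark.sparkContext

    // Each task reports the host it actually ran on.
    val hosts = sc.parallelize(1 to 8, numSlices = 4)
      .mapPartitions(_ => Iterator(InetAddress.getLocalHost.getHostName))
      .collect()
      .distinct

    println(hosts.toList)
    spark.stop()
  }
}
```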
> Hi,
>
> An argument for `functions.count` is needed for per-column counting;
> df.groupBy($"a").agg(count($"b"))
>
> // maropu
>
> On Thu, Jun 23, 2016 at 1:27 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> See the first example in:
> Are you referring to the following method in
> sql/core/src/main/scala/org/apache/spark/sql/functions.scala :
>
> def count(e: Column): Column = withAggregateFunction {
>
> Did you notice this method ?
>
> def count(columnName: String): TypedColumn[Any, Long] =
>
> On
Hey sparkers,
the aggregate function *count* in the *org.apache.spark.sql.functions* package
takes a *column* as an argument. Is this needed for something? I find it
confusing that I need to supply a column there. It feels like it might be a
distinct count or something. This can be seen in the latest
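The per-column use mentioned in maropu's reply can be sketched as follows (hypothetical data; the point is that `count($"b")` counts only the non-null values of `b` within each group, which is why the function takes a column rather than behaving like a plain row count):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

object PerColumnCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]").appName("per-column-count").getOrCreate()
    import spark.implicits._

    // Option encodes as a nullable column, so group "x" has one null in b.
    val df = Seq(("x", Some(1)), ("x", None), ("y", Some(3))).toDF("a", "b")

    // count($"b") skips nulls: group "x" counts 1, not 2.
    df.groupBy($"a").agg(count($"b").as("nonNullB")).show()

    spark.stop()
  }
}
```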
I did not realize that Scala's and Java's immutable collections use
different APIs, which causes this. Thank you for the reminder. This makes
some sense now...
-- Original message --
From: Jonathan Coveney <jcove...@gmail.com>
To: Jakub Dubovsky <spark.dubovsky.ja...@seznam.
But I cannot think of a workaround, and I do not
believe that using ImmutableList with RDDs is impossible. How is this
solved?
Thank you in advance!
Jakub Dubovsky
which would translate the data during (de)serialization?
Thanks!
Jakub Dubovsky
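One way to translate the data during (de)serialization is a custom Kryo registrator (a sketch under the assumption that the third-party `kryo-serializers` library, which ships Guava collection serializers including an `ImmutableListSerializer`, is on the classpath; the artifact and class names should be double-checked):

```scala
import com.esotericsoftware.kryo.Kryo
import de.javakaffee.kryoserializers.guava.ImmutableListSerializer
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Registers a serializer that can rebuild Guava's ImmutableList,
// which Kryo cannot instantiate by default (no public no-arg constructor).
class GuavaRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    ImmutableListSerializer.registerSerializers(kryo)
  }
}

object KryoConfExample {
  // Wire the registrator into Spark's Kryo serializer.
  def conf: SparkConf = new SparkConf()
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.kryo.registrator", classOf[GuavaRegistrator].getName)
}
```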
-- Original message --
From: Igor Berman <igor.ber...@gmail.com>
To: Jakub Dubovsky <spark.dubovsky.ja...@seznam.cz>
Date: 5. 10. 2015 20:11:35
Subject: Re: RDD of ImmutableList
Hi DB,
I cherry-picked the commit into branch-1.2 and it solved the problem. The
commit has some unfinished bits and pieces around it, though, which is why it
was reverted, being late in the release process.
Jakub
--
Just out of curiosity: did you manually apply this patch and see if