Hi again,
On Wed, Jan 14, 2015 at 10:06 AM, Tobias Pfeiffer t...@preferred.jp wrote:
If you think of
items.map(x => /* throw exception */).count()
then even though the count you want to get does not necessarily require
the evaluation of the function in map() (i.e., the number is the
Hi,
On Wed, Jan 14, 2015 at 12:11 PM, Tobias Pfeiffer t...@preferred.jp wrote:
Now I don't know (yet) if all of the functions I want to compute can be
expressed in this way and I was wondering about *how much* more expensive
we are talking about.
OK, it seems like even on a local machine
Thanks for the tips!
Yeah it's a working SBT project. I.e. if I do an SBT run it picks up
Test1 as a main class and runs it for me without error. It's only in
IntelliJ. I opened the project from the folder afresh by choosing the
build.sbt file. I re-tested by deleting .idea and just choosing the
I experimented with using getResourceAsStream(cls, fileName) instead of
cls.getResource(fileName).toURI. That works!
I have no idea why the latter method does not work in Spark. Any
explanations would be welcome.
Thanks,
arun
On Tue, Jan 13, 2015 at 6:35 PM, Arun Lists lists.a...@gmail.com wrote:
An alternative to doing a naive toArray is to declare an accumulator per partition
and use that. It's specifically what they were designed to do. See the
programming guide.
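For what it's worth, a minimal sketch of the per-partition accumulator idea (the names here are mine, not from the original code):
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
// sketch: one Long accumulator per partition; each task bumps the one for its own partition
def partitionSizes[T](sc: SparkContext, rdd: RDD[T]): Array[Long] = {
  val accs = Array.fill(rdd.partitions.length)(sc.accumulator(0L))
  rdd.mapPartitionsWithIndex { (idx, iter) =>
    iter.foreach(_ => accs(idx) += 1L)
    Iterator.single(idx)
  }.count()  // force evaluation so the accumulators get populated
  accs.map(_.value)
}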
-Original Message-
From: Tobias Pfeiffer
I believe the default hash partitioner logic in Spark will send all the
same keys to the same machine.
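If you need to control that explicitly, a minimal sketch (hourlyRdd is assumed to be a pair RDD and the partition count is arbitrary):
import org.apache.spark.HashPartitioner
// with an explicit HashPartitioner, a given key always hashes to the same partition
val partitioned = hourlyRdd.partitionBy(new HashPartitioner(48))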
On Wed, Jan 14, 2015, 03:03 Puneet Kapoor puneet.cse.i...@gmail.com wrote:
Hi,
I have a use case wherein I have an hourly Spark job which creates hourly
RDDs, which are partitioned by keys.
At
Hi,
I have an RDD[(Long, MyData)] where I want to compute various functions on
lists of MyData items with the same key (these will in general be rather
short lists, around 10 items per key).
Naturally I was thinking of groupByKey() but was a bit intimidated by the
warning: This operation may be
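For context, one groupByKey-free way to build those small per-key lists is aggregateByKey; a sketch (rdd stands in for your RDD[(Long, MyData)] and computeStats is a placeholder for whatever functions you need per key):
// collect the ~10 values per key into a list without groupByKey
val grouped = rdd.aggregateByKey(List.empty[MyData])(
  (acc, v) => v :: acc,    // fold a value into the partition-local list
  (a, b) => a ::: b)       // merge partial lists across partitions
val results = grouped.mapValues(list => computeStats(list))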
There is no binding issue here. Spark picks the right ip 10.211.55.3 for you.
The printed message is just an indication.
However I have no idea why spark-shell hangs or stops.
Sent from my iPhone
On Jan 14, 2015, at 5:10 AM, Akhil Das ak...@sigmoidanalytics.com wrote:
It's just a binding issue with the
Hi,
I just test groupByKey method on a 100GB data, the cluster is 20
machine, each with 125GB RAM.
At first I set conf.set("spark.shuffle.use.netty", "false") and run
the experiment, and then I set conf.set("spark.shuffle.use.netty", "true")
again to re-run the experiment, but at the latter
there is no way to avoid the shuffle if you use combineByKey, no matter whether
your data is cached in memory, because the shuffle write must write the
data to disk. And it seems that Spark cannot guarantee that the same
key (K1) goes to Container_X.
you can use the tmpfs for your shuffle dir, this
By the way, I am not sure whether the shuffled key ends up in the
same container.
Hi,
We are using a Scala app to connect to Spark, and when we start the
application we get following error:
19:52:47.185 [sparkDriver-akka.actor.default-dispatcher-2] INFO
o.a.s.d.client.AppClient$ClientActor - Connecting to master
spark://172.22.193.138:7077...
19:52:47.198
Let me rephrase: it is a Play-Scala app which is trying to connect to
Spark.
From: Abhideep Chakravarty
Sent: Wednesday, January 14, 2015 11:34 AM
To: 'user@spark.apache.org'
Subject: Spark 1.2.0 not getting connected
Hi,
We are using a Scala app to connect to Spark, and when we
Hi Jaonary Rabarisoa,
Were you able to fix this issue? Actually I am trying to integrate
OpenCV with Spark. It would be very helpful if you could share your
experience in integrating OpenCV with Spark. It would really help me if you
could share some code on how to use Mat, IplImage and Spark RDDs
To confirm, lihu, are you using Spark version 1.2.0 ?
On Tue, Jan 13, 2015 at 9:26 PM, lihu lihu...@gmail.com wrote:
Hi,
I just test groupByKey method on a 100GB data, the cluster is 20
machine, each with 125GB RAM.
At first I set conf.set("spark.shuffle.use.netty", "false") and run
What version are you running? I think spark.shuffle.use.netty was a valid
option only in Spark 1.1, where the Netty stuff was strictly experimental.
Spark 1.2 contains an officially supported and much more thoroughly tested
version under the property spark.shuffle.blockTransferService, which is
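Setting it explicitly would look like this (a sketch; "netty" and "nio" are, I believe, the accepted values in 1.2):
conf.set("spark.shuffle.blockTransferService", "netty")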
Hi, Can someone suggest any video+image processing library which works
well with Spark? Currently I am trying to integrate OpenCV with Spark. I am
relatively new to both Spark and OpenCV. It would really help me if someone
could share some sample code on how to use Mat, IplImage and Spark
Hi guys,
I'm interested in the IndexedRDD too.
How many rows in the big table that matches the small table in every run?
If the number of rows stays constant, then I think Jem wants the runtime to
stay about constant (i.e. ~ 0.6 second for all cases). However, I agree
with Andrew. The performance
Thanks. The problem is that we'd like it to be picked up by hive.
On Tue Jan 13 2015 at 18:15:15 Davies Liu dav...@databricks.com wrote:
On Tue, Jan 13, 2015 at 10:04 AM, jamborta jambo...@gmail.com wrote:
Hi all,
Is there a way to save dstream RDDs to a single file so that another
Hi Josh,
I was trying out decision tree ensembles using bagging. Here I am splitting
the input using randomSplit and training a tree for each of the splits. Here
is sample code:
val bags : Int = 10
val models : Array[DecisionTreeModel] =
training.randomSplit(Array.fill(bags)(1.0 / bags)).map {
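For reference, a self-contained sketch of this kind of bagging with MLlib decision trees (the classifier parameters and the training RDD are assumptions, not the original code):
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.model.DecisionTreeModel
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
// one tree per random split of the training set
def baggedTrees(training: RDD[LabeledPoint], bags: Int = 10): Array[DecisionTreeModel] =
  training.randomSplit(Array.fill(bags)(1.0 / bags)).map { split =>
    DecisionTree.trainClassifier(split, 2 /* numClasses */,
      Map[Int, Int]() /* categoricalFeaturesInfo */, "gini", 5 /* maxDepth */, 32 /* maxBins */)
  }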
this looks reasonable to me. As you've done, the important thing is just
to make isSplittable return false.
this shares a bit in common with the sc.wholeTextFiles method. It sounds
like you really want something much simpler than what that is doing, but
you might be interested in looking at that
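A minimal sketch of the non-splittable input format approach (the class name is mine):
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.JobContext
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
// an input format whose files are never split, so each file lands in exactly one partition
class UnsplittableTextInputFormat extends TextInputFormat {
  override def isSplitable(context: JobContext, file: Path): Boolean = false
}
// used with e.g. sc.newAPIHadoopFile(path, classOf[UnsplittableTextInputFormat],
//   classOf[org.apache.hadoop.io.LongWritable], classOf[org.apache.hadoop.io.Text])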
Yep it is in the REPL. I will try your solution and also to submit the whole
thing as a job jar. If this is true, this should be fixed, right? I will check
whether there is a ticket already. Somebody pointed me to
https://issues.apache.org/jira/browse/SPARK-2620 but I need to investigate.
Hello all,
When I try to read data from an HBase table, I get an unread block data
exception. I am running HBase and Spark on a single node (my
workstation). My code is in Java, and I'm running it from the Eclipse
IDE. Here are the versions I'm using :
Cloudera : 2.5.0-cdh5.2.1
Hadoop :
Hi all,
Is there a way to save dstream RDDs to a single file so that another process
can pick it up as a single RDD?
It seems that each slice is saved to a separate folder when using the
saveAsTextFiles method.
I'm using spark 1.2 with pyspark
thanks,
Yes, I am running with Scala 2.11. Here is what I see when I do scala
-version
scala -version
Scala code runner version 2.11.4 -- Copyright 2002-2013, LAMP/EPFL
On Tue, Jan 13, 2015 at 2:30 AM, Sean Owen so...@cloudera.com wrote:
It sounds like possibly a Scala version mismatch? are you sure
I am using spark 1.2, and I see a lot of messages like:
ExternalSorter: Thread 66 spilling in-memory map of 5.0 MB to disk (13160
times so far)
I seem to have a lot of memory:
URL: spark://hadoop-m:7077
Workers: 4
Cores: 64 Total, 64 Used
Memory: 328.0 GB Total, 327.0 GB Used
I realized that I was running the cluster with
spark.cassandra.output.concurrent.writes=2,
changing it to 1 did the trick. We realized that the issue was because
spark was producing data at a much higher rate than our small Cassandra
cluster could write, and so changing the property value to 1
Hello all,
I wrote some Java code that uses Spark, but for some reason I can't run it
from the command line. I am running Spark on a single node (my
workstation). The program stops running after this line is executed :
SparkContext sparkContext = new SparkContext("spark://myworkstation:7077",
Hi,
I have a program that loads a single Avro file using Spark SQL, queries it,
transforms it and then outputs the data. The file is loaded with:
val records = sqlContext.avroFile(filePath)
val data = records.registerTempTable("data")
...
Now I want to run it over tens of thousands of Avro files
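A sketch of what I'd try first, assuming spark-avro's avroFile accepts a directory or glob like other Hadoop-backed inputs do (the path and query are placeholders):
import com.databricks.spark.avro._
val records = sqlContext.avroFile("hdfs:///data/avro/*/*.avro")
records.registerTempTable("data")
val out = sqlContext.sql("SELECT * FROM data")  // stands in for the real query/transforms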
Hi,
Thanks for the replies, I guess I was hoping for a bit better than linear
scaling, this was performing IndexedRDD.join(RDD)((id, a, b) => (a, b)). In
each join every row in the smaller table is joined to one in the lookup. I
ran the same test with standard RDD joins and there was barely any
On Tue, Jan 13, 2015 at 10:04 AM, jamborta jambo...@gmail.com wrote:
Hi all,
Is there a way to save dstream RDDs to a single file so that another process
can pick it up as a single RDD?
It does not need to be a single file; Spark can pick up any directory as a single RDD.
Also, it's easy to union
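For example (paths are placeholders):
// a directory or a glob over the saveAsTextFiles output comes back as one RDD
val one = sc.textFile("hdfs:///streaming/out/prefix-*/part-*")
// or union individual slices explicitly
val unioned = sc.union(Seq(
  sc.textFile("hdfs:///streaming/out/slice-1"),
  sc.textFile("hdfs:///streaming/out/slice-2")))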
You could be hitting this issue
https://issues.apache.org/jira/browse/SPARK-5098
Thanks
Best Regards
On Wed, Jan 14, 2015 at 7:47 AM, Josh J joshjd...@gmail.com wrote:
Hi,
I'm trying to run Spark Streaming standalone on two nodes. I'm able to run
on a single node fine. I start both workers
I haven't played with OpenCV yet, but I was just wondering about your
use case. What exactly are you trying to do?
Thanks
Best Regards
On Wed, Jan 14, 2015 at 12:02 PM, Jishnu Prathap jishnu.prat...@wipro.com
wrote:
Hi, Can someone suggest any video+image processing library which works well
with
It's clearly saying *Connection refused: 172.22.193.138:7077*. Just make
sure your master URL is listed as spark://172.22.193.138:7077 in the web
UI (the one running on port 8080), and also be sure that no firewall/network
is blocking the connection (a simple *telnet
172.22.193.138
Have you tried adding this line?
"javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"
This made the problem go away for me. It also works without the provided
scope.
On Wed, Jan 14, 2015 at 5:09 AM, Night Wolf nightwolf...@gmail.com wrote:
Thanks for the tips!
Yeah it's a working SBT
Right now, you can't. You could load each file as a partition into
Hive, or you need to pack the files together with other tools or a Spark
job.
On Tue, Jan 13, 2015 at 10:35 AM, Tamas Jambor jambo...@gmail.com wrote:
Thanks. The problem is that we'd like it to be picked up by hive.
On Tue Jan
Yes Andrew, I am. Tried setting spark.yarn.applicationMaster.waitTries to 1
(thanks Sean), but with no luck. Any ideas?
On Tue, Jan 13, 2015 at 7:58 PM, Andrew Or and...@databricks.com wrote:
Hi Anders, are you using YARN by any chance?
2015-01-13 0:32 GMT-08:00 Anders Arpteg
What path are you giving in the saveAsTextFile? Can you show the whole
line?
On Tue, Jan 13, 2015 at 11:42 AM, shekhar [via Apache Spark User List]
ml-node+s1001560n21112...@n3.nabble.com wrote:
I am still having this issue with the rdd.saveAsTextFile() method.
thanks,
Shekhar reddy
Yeah upon running the test locally I receive:
"Pi is roughly 3.139948"
So Spark is working, it's just the application UI that is not…
On Jan 13, 2015, at 1:13 PM, Ganon Pierce ganon.pie...@me.com wrote:
My application logs remain stored as .inprogress files, e.g.
Hi Anders, are you using YARN by any chance?
2015-01-13 0:32 GMT-08:00 Anders Arpteg arp...@spotify.com:
Since I started using Spark 1.2, I've experienced an annoying issue with
failing apps that get executed twice. I'm not talking about tasks inside a
job, which should be executed multiple
Perhaps I need to change my spark.eventLog.dir to an hdfs directory? Could this
have something to do with the “history server” not having access to my
application logs?
On Jan 13, 2015, at 1:13 PM, Ganon Pierce ganon.pie...@me.com wrote:
My application logs remain stored as .inprogress
Hi,
Could you just try one thing. Make a directory anywhere outside
cloudera and then try the same write.
Suppose the directory made is testWrite.
do r.saveAsTextFile("/home/testWrite/")
I think the cloudera/tmp folder does not have write permission for users
other than the cloudera
My application logs remain stored as .inprogress files, e.g.
"app-20150113190025-0004.inprogress" even after completion, could this have
something to do with what is going on.
@ Ted Yu
Where do I find the master log? It’s not very obviously labeled in my /tmp/
directory. Sorry if I should
Also, thanks for everyone’s help so far!
On Jan 13, 2015, at 2:04 PM, Ganon Pierce ganon.pie...@me.com wrote:
Yeah upon running the test locally I receive:
"Pi is roughly 3.139948"
So Spark is working, it's just the application UI that is not…
On Jan 13, 2015, at 1:13 PM, Ganon
In some classes, I initialize some values from resource files using the
following snippet:
new File(cls.getResource(fileName).toURI)
This works fine in SBT. When I run it using spark-submit, I get a
bunch of errors because the classes cannot be initialized. What can I
do to make such
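For reference, the stream-based variant mentioned elsewhere in this thread, which avoids turning the URL into a File (a sketch):
// read the resource from the classpath as a stream; inside a jar run via
// spark-submit there is no plain file on disk for File to point at
val stream = cls.getResourceAsStream(fileName)
val contents = scala.io.Source.fromInputStream(stream).mkString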
Hi,
On Mon, Jan 12, 2015 at 8:09 PM, Ganelin, Ilya ilya.gane...@capitalone.com
wrote:
Use the mapPartitions function. It returns an iterator to each partition.
Then just get that length by converting to an array.
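i.e. something along these lines (a sketch; rdd is a placeholder name):
// one number per partition, holding that partition's element count
val sizes = rdd.mapPartitions(iter => Iterator(iter.size)).collect()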
On Tue, Jan 13, 2015 at 2:50 PM, Kevin Burton bur...@spinn3r.com wrote:
Hi,
I'm trying to run Spark Streaming standalone on two nodes. I'm able to run
on a single node fine. I start both workers and it registers in the Spark
UI. However, the application says
SparkDeploySchedulerBackend: Asked to remove non-existent executor 2
Any ideas?
Thanks,
Josh
Since I started using Spark 1.2, I've experienced an annoying issue with
failing apps that get executed twice. I'm not talking about tasks inside a
job, which should be executed multiple times before failing the whole app.
I'm talking about the whole app, which seems to close the previous Spark
This is interesting.
I’m using ObjectInputStream to try to read a file written as
saveAsObjectFile… but it’s not working.
The documentation says:
"Write the elements of the dataset in a simple format using Java
serialization, which can then be loaded using SparkContext.objectFile()."
… but
Setting spark.sql.hive.convertMetastoreParquet to true has fixed this.
Regards, Ajay
On Tuesday, January 13, 2015 11:50 AM, Ajay Srivastava
a_k_srivast...@yahoo.com.INVALID wrote:
Hi, I am trying to read a parquet file using -
val parquetFile = sqlContext.parquetFile("people.parquet")
This is almost funny.
I want to dump a computation to the filesystem. It’s just the result of a
Spark SQL call reading the data from Cassandra.
The problem is that it looks like it’s just calling toString() which is
useless.
The example is below.
I assume this is just a (bad) bug.
It is just calling RDD's saveAsTextFile. I guess we should really override
the saveAsTextFile in SchemaRDD (or make Row.toString comma separated).
Do you mind filing a JIRA ticket and copying me?
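In the meantime a workaround sketch, mapping each Row to CSV before saving (schemaRdd and the output path are placeholders):
schemaRdd.map(row => row.mkString(",")).saveAsTextFile("hdfs:///out/result-csv")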
On Tue, Jan 13, 2015 at 12:03 AM, Kevin Burton bur...@spinn3r.com wrote:
This is almost funny.
I
What query did you run? Parquet should have predicate and column pushdown,
i.e. if your query only needs to read 3 columns, then only 3 will be read.
On Mon, Jan 12, 2015 at 10:20 PM, Ajay Srivastava
a_k_srivast...@yahoo.com.invalid wrote:
Hi,
I am trying to read a parquet file using -
val
After a clean build I am still receiving the same error.
On Jan 6, 2015, at 3:59 PM, Sean Owen so...@cloudera.com wrote:
FWIW I do not see any such error, after a mvn -DskipTests clean package and
./bin/spark-shell from master. Maybe double-check you have done a full
clean build.
On Tue,
Thanks for your answer David,
It is as I thought then. When you write that there could be some approaches
to solve this using Yarn or Mesos, can you give any idea about this? Or
better yet, is there any site with documentation about this issue?
Currently, we are launching our jobs using Yarn, but
Ganon:
Can you check the master log to see if there is some clue ?
Cheers
On Jan 13, 2015, at 2:03 AM, Robin East robin.e...@xense.co.uk wrote:
I’ve just pulled down the latest commits from github, and done the following:
1)
mvn clean package -DskipTests
builds fine
2)
Hi,
I am using spark 1.1.0.
I am using the spark-sql shell to run all the below queries.
I have created an external parquet table using the below SQL:
create external table daily (15 column names)
ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
OK, I still wonder whether it's not better to make one big model. The
usual assumption is that the user's identity isn't predictive per se.
If every customer in your shop is truly unlike the others, most
predictive analytics goes out the window. It's factors like our
location, income, etc that are
I’ve just pulled down the latest commits from github, and done the following:
1)
mvn clean package -DskipTests
builds fine
2)
./bin/spark-shell works
3)
run SparkPi example with no problems:
./bin/run-example SparkPi 10
4)
Started a master
./sbin/start-master.sh
grabbed the MasterWebUI
Yes, that's even what the objectFile javadoc says. It is expecting a
SequenceFile with NullWritable keys and BytesWritable values containing the
serialized values. This looks correct to me.
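In other words, the intended round trip is roughly (the path and element type are placeholders):
rdd.saveAsObjectFile("hdfs:///out/objects")
val restored = sc.objectFile[MyRecord]("hdfs:///out/objects")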
On Tue, Jan 13, 2015 at 8:39 AM, Kevin Burton bur...@spinn3r.com wrote:
This is interesting.
I’m using
I don't believe you need any such manual steps. I'd undo what you did
there. I have never had to add anything manually to get SBT or Maven
builds working. Just follow the docs on the site.
On Tue, Jan 13, 2015 at 5:29 AM, Rapelly Kartheek
kartheek.m...@gmail.com wrote:
Yes, this proxy problem is
Thanks a lot for the suggestion! This approach makes perfect sense. I think
this is what is being addressed by the spark-jobserver project:
https://github.com/ooyala/spark-jobserver. Do you know any other
production-ready similar implementations?
On Thu, Jan 8, 2015 at 1:47 PM, Silvio Fiorito
Dear All,
For our requirement, we need to define a SparkContext with SparkConf which
has Cassandra connection details. And this SparkContext needs to be shared
for subsequent runJobs and throughout the application.
So, how do we define a SparkContext with a Cassandra connection for
spark-jobserver?
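For the plain-Spark side, a minimal sketch (the host is a placeholder; with the job server itself, the same spark.cassandra.* settings usually go into its context configuration instead):
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf()
  .setAppName("shared-cassandra-context")
  .set("spark.cassandra.connection.host", "10.0.0.1")
val sc = new SparkContext(conf)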
(This also works for me on a fresh VM, fresh pull, etc.)
On Tue, Jan 13, 2015 at 10:03 AM, Robin East robin.e...@xense.co.uk wrote:
I’ve just pulled down the latest commits from github, and done the
following:
1)
mvn clean package -DskipTests
builds fine
2)
./bin/spark-shell works
3)
Hi,
I'm trying to load up an SBT project in IntelliJ 14 (Windows) running JDK
1.7, SBT 0.13.5 - I seem to be getting errors with the project.
The build.sbt file is super simple;
name := "scala-spark-test1"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %%
Hi,
I have been playing around with the indexedRDD (
https://issues.apache.org/jira/browse/SPARK-2365,
https://github.com/amplab/spark-indexedrdd) and have been very impressed
with its performance. Some performance testing has revealed worse than
expected scaling of the join performance*, and I
I want to save to local directory. I have tried the following and get error
r.saveAsTextFile("file:/home/cloudera/tmp/out1")
r.saveAsTextFile("file:///home/cloudera/tmp/out1")
r.saveAsTextFile("file:home/cloudera/tmp/out1")
They all generate the following error
15/01/12 08:31:10 WARN
Hi Jem,
Linear scaling with the size of the big table doesn't seem that surprising to
me. What were you expecting?
I assume you're doing normalRDD.join(indexedRDD). If you were to replace
the indexedRDD with a normal RDD, what times do you get?
On Tue, Jan 13, 2015 at 5:35 AM, Jem Tucker
That’s really useful, thanks.
From: Andrew Ash [mailto:and...@andrewash.com]
Sent: 09 January 2015 22:42
To: England, Michael (IT/UK)
Cc: raghavendra.pan...@gmail.com; user
Subject: Re: Cleaning up spark.local.dir automatically
That's a worker setting which cleans up the files left behind by
Seems a bit early for anyone to have published anything regarding spark 1.2.
Direct comparisons between 1.1 and 1.2 seem more likely in the near future; you
should be able to extrapolate to comparisons with other systems that have been
done in the past.
One thing that would be really helpful
Looking at the source code for AbstractGenericUDAFResolver, the following
(non-deprecated) method should be called:
public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
It is called by hiveUdfs.scala (master branch):
val parameterInfo = new
Yes.. but this isn’t what the main documentation says.
The file format isn’t very discoverable..
Also, the documentation doesn’t say anything about the group by 10.. what’s
that about?
Kevin
On Tue, Jan 13, 2015 at 2:28 AM, Sean Owen so...@cloudera.com wrote:
Yes, that's even what the
Hi Folks,
I am trying to run a hive context in yarn-cluster mode, but I hit an error. Does
anybody know what causes the issue?
I use following cmd to build the distribution:
./make-distribution.sh -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4
15/01/13 17:59:42 INFO
It's just a binding issue with the hostnames in your /etc/hosts file. You can
set SPARK_LOCAL_IP and SPARK_MASTER_IP in your conf/spark-env.sh file and
restart your cluster. (In that case the spark://myworkstation:7077 will
change to the IP address that you provided, e.g. spark://10.211.55.3.)
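For example (using the address from this thread):
# conf/spark-env.sh
export SPARK_LOCAL_IP=10.211.55.3
export SPARK_MASTER_IP=10.211.55.3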
Thanks
Hi All,
I am trying with a small data set. It is only 200 MB, and what I am doing
is just a distinct count on it.
But there is a lot of spilling happening in the log (attached at the end of
the email).
Basically I use 10 GB of memory, running on a one-node EMR cluster with an
r3.8xlarge instance
Yep did this and can view the masterwebui no problem:
4)
Started a master
./sbin/start-master.sh
grabbed the MasterWebUI from the master log - Started MasterWebUI at
http://x.x.x.x:8080
Can view the MasterWebUI from local browser
However, cannot view the app UI in
Here is the master log:
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp
::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.3.0-SNAPSHOT-hadoop1.0.4.jar
-XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m
It is not meant to be a public API. If you want to use it, maybe copy the
code out of the package and put it in your own project.
On Fri, Jan 9, 2015 at 7:19 AM, Tae-Hyuk Ahn ahn@gmail.com wrote:
Hi,
I would like to use OpenHashSet
(org.apache.spark.util.collection.OpenHashSet) in my
You could try setting the following to tweak the application a little bit:
.set("spark.rdd.compress", "true")
.set("spark.storage.memoryFraction", "1")
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
For shuffle behavior, you can look at this document
Attempting to bump this up in case someone can help out after all. I spent
a few good hours stepping through the code today, so I'll summarize my
observations both in hope I get some help and to help others that might be
looking into this:
1. I am setting *spark.sql.parquet.filterPushdown=true*
Thanks, Josh and Reynold. Yes, I can incorporate it into my package and
use it. But I am still wondering why you designed such useful
functions as private.
On Tue, Jan 13, 2015 at 3:33 PM, Reynold Xin r...@databricks.com wrote:
It is not meant to be a public API. If you want to use it, maybe copy
All right, I removed cloudera totally and installed Spark manually on a bare Linux
system, and now r.saveAsTextFile(…) works.
Thanks.
Regards,
Ningjun Wang
Consulting Software Engineer
LexisNexis
121 Chanlon Road
New Providence, NJ 07974-1541
From: Prannoy [mailto:pran...@sigmoidanalytics.com]
I had a similar issue; I downgraded my IntelliJ version to 13.1.4 and then
it was gone.
Although there was some discussion that already happened here, and for some
people the following was the solution:
Go to Preferences > Build, Execution, Deployment > Scala Compiler and clear
the Additional compiler options
Awesome.
Thanks
Best Regards
On Tue, Jan 13, 2015 at 10:35 PM, Ankur Srivastava
ankur.srivast...@gmail.com wrote:
I realized that I was running the cluster with
spark.cassandra.output.concurrent.writes=2,
changing it to 1 did the trick. We realized that the issue was because
spark was
Hi,
The following SQL query
select percentile_approx(variables.var1, 0.95) p95
from model
will throw
ERROR SparkSqlInterpreter: Error
org.apache.hadoop.hive.ql.parse.SemanticException: This UDAF does not
support the deprecated getEvaluator() method.
at
I am using spark 1.1 with the ooyala job server (which basically creates long
running spark jobs as contexts to execute jobs in). These contexts have
cached RDDs in memory (via RDD.persist()).
I want to enable the spark.cleaner to cleanup the /spark/work directories
that are created for each app,
Had the same issue. I can't remember what the issue was but this works:
libraryDependencies ++= {
  val sparkVersion = "1.2.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
    "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
    "org.apache.spark" %%
I find importing a working SBT project into IntelliJ is the way to go.
How did you load the project into intellij?
On Jan 13, 2015, at 4:45 PM, Enno Shioji eshi...@gmail.com wrote:
Had the same issue. I can't remember what the issue was but this works:
libraryDependencies ++= {
Yeah, it's a bug. It has been fixed by
https://issues.apache.org/jira/browse/SPARK-3891 in master.
On Tue, Jan 13, 2015 at 2:41 PM, Ted Yu yuzhih...@gmail.com wrote:
Looking at the source code for AbstractGenericUDAFResolver, the following
(non-deprecated) method should be called:
public
Ah, thx Ted and Yin!
I'll build a new version. :)
Jianshi
On Wed, Jan 14, 2015 at 7:24 AM, Yin Huai yh...@databricks.com wrote:
Yeah, it's a bug. It has been fixed by
https://issues.apache.org/jira/browse/SPARK-3891 in master.
On Tue, Jan 13, 2015 at 2:41 PM, Ted Yu yuzhih...@gmail.com