TensorFrames is a project from Databricks (
https://github.com/databricks/tensorframes). No commits for a couple of
months, though.
Does anyone have an insight on the status of the project?
On Mon, 12 Dec 2016 at 19:31 Meeraj Kunnumpurath wrote:
> Apologies. okay, I
Hi
I started playing with both Apache projects and quickly got that exception.
Can anyone give a hint on the problem so that I can dig
further?
It seems to be a problem with Spark loading some of the Groovy classes ...
Any idea?
Thanks
Guillaume
tog GroovySpark $ $GROOVY_HOME/bin
maybe now with static compilation and the Java 7
> invokedynamic JARs things are better. I'm still unsure I'd use it in
> production, and, given Spark's focus on Scala and Python, I'd pick one of
> those two
>
>
> On 18 Nov 2015, at 20:35, tog <guillaume.all...@gmail.com> wrote:
Hi Bala
Can't you use a simple dictionary and map those values to numbers?
Cheers
Guillaume
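A minimal sketch of that dictionary mapping in plain Scala (the column values here are made up for illustration):

```scala
// Hypothetical categorical column values from a CSV.
val values = Seq("red", "green", "blue", "green", "red")

// Build the "dictionary": assign each distinct value a stable numeric index.
val index: Map[String, Int] = values.distinct.zipWithIndex.toMap

// Encode the column numerically using that index.
val encoded: Seq[Int] = values.map(index)
// encoded == Seq(0, 1, 2, 1, 0)
```

On an RDD the same idea would need the dictionary collected (or broadcast) on the driver first, which is why it stops scaling when the number of distinct values grows.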
On 5 November 2015 at 09:54, Balachandar R.A.
wrote:
> Hi
>
>
> I am new to Spark MLlib and machine learning. I have a CSV file that
> consists of around 100 thousand rows and
> However, I read about HashingTF, which does exactly
> this quite efficiently and can scale too. Hence, I am looking for a
> solution using this technique.
>
>
> regards
> Bala
>
>
> On 5 November 2015 at 18:50, tog <guillaume.all...@gmail.com> wrote:
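For intuition, here is a toy version of the hashing trick that HashingTF relies on, in plain Scala (this is not Spark's implementation; the vector size is arbitrary). The point is that no dictionary is kept in memory, because each term is mapped straight to an index by hashing:

```scala
// Fixed feature-vector size; small here, so hash collisions are possible.
val numFeatures = 16

// Map a term to a bucket index via its hash (made non-negative).
def termIndex(term: String): Int =
  ((term.hashCode % numFeatures) + numFeatures) % numFeatures

// Term-frequency vector for one document.
def hashingTF(doc: Seq[String]): Array[Double] = {
  val vec = new Array[Double](numFeatures)
  doc.foreach(t => vec(termIndex(t)) += 1.0)
  vec
}

val vec = hashingTF(Seq("spark", "groovy", "spark"))
// vec sums to 3.0; "spark" contributes 2.0 to its bucket
```

The trade-off is that two distinct values can land in the same bucket, which is why a larger `numFeatures` is usually chosen in practice.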
it with just (comment out line 27):
println "Count of spark: " + file.filter({ s -> s.contains('spark') }).count()
Thanks
Best Regards
On Sun, Jul 26, 2015 at 12:43 AM, tog guillaume.all...@gmail.com wrote:
Hi
I have been using Spark for quite some time using either Scala or Python.
I wanted
not doing correctly here.
Thanks
tog Groovy4Spark $ groovy GroovySparkWordcount.groovy
class org.apache.spark.api.java.JavaRDD
true
true
Caught: org.apache.spark.SparkException: Task not serializable
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner
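The usual cause of `Task not serializable` is a closure that captures a non-serializable enclosing object (often the outer script or driver class). A Spark-free sketch reproducing it with plain Java serialization, which is what Spark uses to ship closures to executors (the class and field names here are made up):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for a non-serializable outer class (e.g. a Groovy script class).
class Driver {
  val keyword = "spark"
  // Referencing the field makes the lambda capture `this`, dragging
  // the whole non-serializable Driver into the closure.
  val bad: String => Boolean = s => s.contains(keyword)
  // Copying the field to a local first keeps the closure self-contained.
  val good: String => Boolean = { val k = keyword; s => s.contains(k) }
}

// Try to serialize an object the way Spark would serialize a task closure.
def serializable(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
    true
  } catch { case _: NotSerializableException => false }

val d = new Driver
// serializable(d.bad)  == false  -> would raise "Task not serializable" in Spark
// serializable(d.good) == true
```

If the Groovy script class behind the closure is not `Serializable`, the same failure shows up in `ClosureCleaner`; making the enclosing class serializable or hoisting captured values into locals are the usual fixes.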
Hi
Sorry for this Scala/Spark newbie question. I am creating an RDD which
represents a large time series this way:
val data = sc.textFile("somefile.csv")
case class Event(
time: Double,
x: Double,
vztot: Double
)
val events = data.filter(s => !s.startsWith("GMT")).map { s =>
the time series.
On 2 July 2015 at 18:25, Feynman Liang fli...@databricks.com wrote:
What's the error you are getting?
On Thu, Jul 2, 2015 at 9:37 AM, tog guillaume.all...@gmail.com wrote:
Hi
Sorry for this Scala/Spark newbie question. I am creating an RDD which
represents a large time series
, 2015 at 2:33 PM, tog guillaume.all...@gmail.com wrote:
Was complaining about the Seq ...
Moved it to
val eventsfiltered = events.sliding(3).map(s => Event(s(0).time,
(s(0).x + s(1).x + s(2).x)/3.0, (s(0).vztot + s(1).vztot + s(2).vztot)/3.0))
and that is working.
Anyway this is not what I wanted to do
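The `sliding(3)` averaging can be checked on a plain Scala collection (the numbers below are made up; `Event` has the same shape as in the thread):

```scala
// Same case class as in the thread.
case class Event(time: Double, x: Double, vztot: Double)

val events = Seq(
  Event(0.0, 1.0, 10.0), Event(1.0, 2.0, 20.0),
  Event(2.0, 3.0, 30.0), Event(3.0, 4.0, 40.0))

// 3-point moving average, keeping the timestamp of the window's first event.
val smoothed = events.sliding(3).map(s => Event(s(0).time,
  (s(0).x + s(1).x + s(2).x) / 3.0,
  (s(0).vztot + s(1).vztot + s(2).vztot) / 3.0)).toSeq
// smoothed == Seq(Event(0.0, 2.0, 20.0), Event(1.0, 3.0, 30.0))
```

Note that `Seq.sliding` produces overlapping windows (here two windows from four events), which is why this smooths rather than buckets the series.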
, d, e), 2), ((d, e, f), 3)]
After filter: [((a,b,c), 0), ((d, e, f), 3)], which is what I'm assuming
you want (non-overlapping buckets)? You can then do something like
.map(func(_._1)) to apply func (e.g. min, max, mean) to the 3-tuples.
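That non-overlapping bucketing can be sketched on a plain Scala collection (the letters and window size follow the example above):

```scala
val data = Seq("a", "b", "c", "d", "e", "f")

// Index every overlapping window of 3.
val windows = data.sliding(3).zipWithIndex.toSeq
// windows == Seq((abc, 0), (bcd, 1), (cde, 2), (def, 3))

// Keep every third window -> non-overlapping buckets.
val buckets = windows.filter(_._2 % 3 == 0).map(_._1)
// buckets == Seq(Seq("a", "b", "c"), Seq("d", "e", "f"))
```

A `func` such as min, max, or mean can then be mapped over each bucket, as described above.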
On Thu, Jul 2, 2015 at 3:20 PM, tog guillaume.all
Hi
Have you tested the Cloudera project:
https://github.com/cloudera/spark-timeseries ?
Let me know how you progress on that route, as I am also interested in
that topic.
Cheers
On 26 June 2015 at 14:07, Caio Cesar Trucolo truc...@gmail.com wrote:
Hi everyone!
I am working with
. You may want to take a deeper look at
SparkContext.newAPIHadoopRDD to load your data.
On Sat, May 9, 2015 at 4:48 PM, tog <guillaume.all...@gmail.com> wrote:
Hi
I have an application that currently runs using MR. It currently starts
extracting information from a proprietary binary file that is copied to
HDFS. The application starts by creating business objects from information
extracted from the binary files. Later those objects are used for further
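The binary-extraction step can be prototyped without MR or Spark. A sketch assuming a made-up fixed record layout (an int id followed by a double value), since the real proprietary format isn't shown in the thread:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical "business object" built from each binary record.
case class Record(id: Int, value: Double)

// Parse fixed-size records (4-byte int + 8-byte double = 12 bytes each).
def readRecords(bytes: Array[Byte]): Seq[Record] = {
  val in = new DataInputStream(new ByteArrayInputStream(bytes))
  Iterator.continually(in)
    .takeWhile(_.available() >= 12)
    .map(s => Record(s.readInt(), s.readDouble()))
    .toSeq
}

// Round-trip check with two synthetic records.
val out = new ByteArrayOutputStream
val dos = new DataOutputStream(out)
dos.writeInt(1); dos.writeDouble(3.5)
dos.writeInt(2); dos.writeDouble(7.25)
val records = readRecords(out.toByteArray)
// records == Seq(Record(1, 3.5), Record(2, 7.25))
```

In Spark, the same per-file parsing logic could then be plugged into something like `SparkContext.binaryFiles` or a custom Hadoop InputFormat via `newAPIHadoopRDD`, as suggested earlier in the thread.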