Nice! I am especially interested in Bayesian Networks, which are only one
of the many models that can be expressed by a factor graph representation.
Do you do Bayesian network learning at scale (parameters and structure)
with latent variables? Are you using publicly available tools for that?
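On the latent-variable part of the question, a minimal sketch of what EM-style parameter learning looks like: the classic two-coin example, where the coin used for each sequence is a hidden variable. Plain Python, a toy illustration only, not any particular BN library.

```python
def em_two_coins(flips, theta=(0.6, 0.5), iters=20):
    """Estimate two coin biases when the coin used per sequence is latent."""
    for _ in range(iters):
        counts = [[0.0, 0.0], [0.0, 0.0]]  # per coin: [expected heads, expected tails]
        for seq in flips:
            h, t = seq.count("H"), seq.count("T")
            # E-step: posterior probability that each coin produced this sequence
            lik = [th ** h * (1 - th) ** t for th in theta]
            z = sum(lik)
            for c in (0, 1):
                w = lik[c] / z
                counts[c][0] += w * h
                counts[c][1] += w * t
        # M-step: re-estimate each bias from the expected counts
        theta = tuple(c[0] / (c[0] + c[1]) for c in counts)
    return theta

# A classic five-sequence dataset; converges to roughly (0.80, 0.52).
data = ["HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH", "HTHTTTHHTT", "THHHTHHHTH"]
print(em_two_coins(data))
```

Structure learning at scale is a much harder problem than this parameter-only sketch, which is exactly why the question about publicly available tools matters.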
2. Who is or was using *interruptOnCancel*? Did you get burned? Is it
still working without any incident?
Thanks in advance for any info, feedback and war stories.
Bertrand Dechoux
The big question would be which features of Esper you are using. Esper is
a CEP solution, and I doubt that Spark Streaming can do everything Esper
does without additional development. Spark (Streaming) is more of a
general-purpose platform.
http://www.espertech.com/products/esper.php
But I would be glad to be proven wrong.
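To make the CEP point concrete, here is a toy sketch of the kind of pattern detection engines like Esper provide out of the box: flag a "login_failed" event followed by a "login_success" within a small event window. Plain Python with illustrative event names; on Spark Streaming you would have to build such logic yourself.

```python
def detect_pattern(events, first="login_failed", second="login_success", window=3):
    """Return (i, j) index pairs where `second` follows `first` within `window` events."""
    hits = []
    for i, ev in enumerate(events):
        if ev == first:
            # look ahead inside the window for the follow-up event
            for j in range(i + 1, min(i + 1 + window, len(events))):
                if events[j] == second:
                    hits.append((i, j))
                    break
    return hits

stream = ["login_failed", "noise", "login_success", "login_failed", "noise", "noise", "noise"]
print(detect_pattern(stream))  # [(0, 2)]
```

A real CEP engine adds time-based windows, joins, and a query language on top of this idea, which is the development effort being referred to above.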
Well, anyone can open an account on the Apache JIRA and post a new
ticket/enhancement/issue/bug...
Bertrand Dechoux
On Fri, Jul 25, 2014 at 4:07 PM, Sparky gullo_tho...@bah.com wrote:
Thanks for the suggestion. I can confirm that my problem is that I have
files with zero bytes. It's a known bug
Is there any documentation from Cloudera on how to run Spark apps on a
Cloudera Manager-deployed Spark?
Asking the cloudera community would be a good idea.
http://community.cloudera.com/
In the end, only Cloudera can quickly fix issues with CDH...
Bertrand Dechoux
On Wed, Jul 23, 2014 at 9:28 AM
And you might want to apply clustering first. It is likely that the users
and items are not all unique.
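A hedged sketch of that remark: before batch-scoring every (user, item) pair, collapse users that share the same feature vector so each distinct profile is scored only once. Plain Python with illustrative names, not a Spark API.

```python
def distinct_profiles(user_vectors):
    """Map each distinct feature vector to the list of users sharing it."""
    groups = {}
    for user, vec in user_vectors.items():
        # tuples are hashable, so identical vectors land in the same bucket
        groups.setdefault(tuple(vec), []).append(user)
    return groups

users = {"u1": [1, 0], "u2": [1, 0], "u3": [0, 1]}
groups = distinct_profiles(users)
print(len(groups))  # 2 distinct profiles instead of 3 users
```

A real clustering step (k-means and friends) generalizes this from "identical" to "similar" profiles, shrinking the prediction matrix further.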
Bertrand Dechoux
On Fri, Jul 18, 2014 at 9:13 AM, Nick Pentreath nick.pentre...@gmail.com
wrote:
It is very true that making predictions in batch for all 1 million users
against the 10k
functions with no side effects (i.e. the only impact is the returned
result), then you just need to ignore results from additional attempts of
the same task/operator.
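A minimal sketch (plain Python, no Spark) of why side-effect-free tasks make retries safe: if a task is deterministic and its only impact is its return value, duplicate attempts can simply be discarded.

```python
def run_with_retries(task_fn, partitions, attempts_per_task=2):
    """Run each task possibly several times; keep one result per partition."""
    results = {}
    for pid in partitions:
        for _ in range(attempts_per_task):  # speculative / retried attempts
            value = task_fn(pid)
            # First completed attempt wins; later duplicates are ignored.
            results.setdefault(pid, value)
    return [results[pid] for pid in sorted(results)]

# A pure function: same input always yields the same output,
# so extra attempts change nothing.
print(run_with_retries(lambda x: x * x, range(4)))  # [0, 1, 4, 9]
```

With a side-effecting task (say, one that appends to an external store), the duplicate attempts would be observable, which is exactly the case the paragraph above excludes.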
Bertrand Dechoux
On Tue, Jul 15, 2014 at 9:34 PM, Andrew Ash and...@andrewash.com wrote:
Hi Nan,
Great digging
A patch proposal on the apache JIRA for Spark?
https://issues.apache.org/jira/browse/SPARK/
Bertrand
On Thu, Jul 10, 2014 at 2:37 PM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
And there is also a small bug in the implementation, as I mentioned
earlier.
This is my first
of it.
Regards
Bertrand Dechoux
For the second question, I would say it is mainly because the projects do
not have the same aim. Impala does have a cost-based optimizer and
predicate-propagation capability, which is natural because it interprets
pseudo-SQL queries. In the realm of relational databases, it is often not
a good idea
I guess you have to understand the difference in architecture. I don't
know much about C++ MPI, but it is basically MPI, whereas Spark is
inspired by Hadoop MapReduce and optimised for reading/writing large
amounts of data with a smart caching and locality strategy. Intuitively,
if you have a high
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html
We do not currently cache blocks which are under construction, corrupt, or
otherwise incomplete.
Have you tried with a file with more than 1 block?
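The quick arithmetic behind that question: a file only spans multiple HDFS blocks once it exceeds the configured block size (128 MB is a common default; your cluster may differ).

```python
import math

def num_blocks(file_size_bytes, block_size=128 * 1024 * 1024):
    """Number of HDFS blocks a file of the given size occupies."""
    return math.ceil(file_size_bytes / block_size)

print(num_blocks(200 * 1024 * 1024))  # 2: a 200 MB file spills into a second block
```

So any test file smaller than the block size exercises only the single-block code path.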
And
http://spark-summit.org ?
Bertrand
On Thu, May 8, 2014 at 2:05 AM, Ian Ferreira ianferre...@hotmail.com wrote:
Folks,
I keep getting questioned on real world experience of Spark as in mission
critical production deployments. Does anyone have some war stories to share
or know of resources
Cool, thanks for the link.
Bertrand Dechoux
On Mon, Apr 21, 2014 at 7:31 PM, Nick Pentreath nick.pentre...@gmail.com wrote:
Also see: https://github.com/apache/spark/pull/455
This will add support for reading SequenceFiles and other InputFormats in
PySpark, as long as the Writables are either
Hi,
I have browsed the online documentation, and it is stated that PySpark can
only read text files as sources. Is that still the case?
From what I understand, after this first step the RDD can contain any
serialized Python structure, provided the class definitions are well
distributed.
Is it not possible to read
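The "serialized Python structure" point can be illustrated without Spark: PySpark ships RDD elements between processes by pickling them, so any picklable object can live in an RDD (for custom classes, their modules must also be importable on the workers, which is the "well distributed" caveat above).

```python
import pickle

# Any picklable structure round-trips: this is essentially what happens
# when an element moves between the driver and a worker.
record = {"user": "u1", "ratings": [("i9", 4.5), ("i7", 3.0)]}
blob = pickle.dumps(record)
restored = pickle.loads(blob)
print(restored == record)  # True
```

Reading binary sources (SequenceFiles, other InputFormats) is a separate problem, which is what the pull request linked above addresses.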
I don't know the Spark issue but the Hadoop context is clear.
old api - org.apache.hadoop.mapred
new api - org.apache.hadoop.mapreduce
You might only need to change your import.
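A toy helper (plain Python, illustrative only) mapping a couple of old-API Hadoop classes to their new-API locations; note the new API moved the input formats under `lib.input`, so the rename is not always one-to-one, and real migrations may need more than an import change.

```python
# Known relocations between the two Hadoop APIs.
OLD_TO_NEW = {
    "org.apache.hadoop.mapred.TextInputFormat":
        "org.apache.hadoop.mapreduce.lib.input.TextInputFormat",
    "org.apache.hadoop.mapred.FileInputFormat":
        "org.apache.hadoop.mapreduce.lib.input.FileInputFormat",
}

def migrate_import(line):
    """Rewrite an import line from the old API package to the new one."""
    for old, new in OLD_TO_NEW.items():
        line = line.replace(old, new)
    return line

print(migrate_import("import org.apache.hadoop.mapred.TextInputFormat;"))
# import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
```

The Spark APIs mirror this split: methods taking old-API InputFormats sit alongside `newAPIHadoop*` variants for the new package.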
Regards
Bertrand
On Wed, Mar 19, 2014 at 11:29 AM, Pariksheet Barapatre pbarapa...@gmail.com
wrote:
Hi,