Reynold, what's the idea behind using LLVM?
On Wed, Apr 1, 2015 at 12:31 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Nice try :)
Thanks
Best Regards
On Wed, Apr 1, 2015 at 12:41 PM, Reynold Xin r...@databricks.com wrote:
Hi Spark devs,
I've spent the last few months
Hi,
Previously, in 1.2.1, the result row from a Spark SQL query was
an org.apache.spark.sql.api.java.Row.
In 1.3.0 I do not see a sql.api.java package. So does it mean that even the
SQL query result row is an implementation of org.apache.spark.sql.Row, such
as GenericRow, etc.?
--
Niranda
Yup - we merged the Java and Scala APIs, so there is now a single set of
APIs that supports both languages.
See more at
http://spark.apache.org/docs/latest/sql-programming-guide.html#unification-of-the-java-and-scala-apis
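For example, a minimal sketch of what this looks like in 1.3 (the table and
column names are made up):

  import org.apache.spark.sql.Row

  val results = sqlContext.sql("SELECT name, age FROM people")
  results.collect().foreach { case Row(name: String, age: Int) =>
    println(s"$name is $age")
  }

The same org.apache.spark.sql.Row type comes back whether you call it from
Java or Scala.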
On Tue, Mar 31, 2015 at 11:40 PM, Niranda Perera niranda.per...@gmail.com
This is a significant effort that Reynold has undertaken, and I am super
glad to see that it's finally taking a concrete form. Would love to see
what the community thinks about the idea.
TD
On Wed, Apr 1, 2015 at 3:11 AM, Reynold Xin r...@databricks.com wrote:
Hi Spark devs,
I've spent the
bq. writing the output (to Amazon S3) failed
What's the value of fs.s3.maxRetries?
Increasing the value should help.
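For example, a quick sketch (the values here are only illustrative) of
raising it on the underlying Hadoop configuration before the write:

  // fs.s3.maxRetries defaults to 4; fs.s3.sleepTimeSeconds to 10
  sc.hadoopConfiguration.set("fs.s3.maxRetries", "20")
  sc.hadoopConfiguration.set("fs.s3.sleepTimeSeconds", "10")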
Cheers
On Wed, Apr 1, 2015 at 8:34 AM, Romi Kuntsman r...@totango.com wrote:
What about communication errors and not corrupted files?
Both when reading input and when writing output.
We currently experience a failure of the entire process if the last stage
of writing the output (to Amazon S3) fails because of a very temporary DNS
resolution issue (easily resolved by
Great! Thank you!
From: Reynold Xin [mailto:r...@databricks.com]
Sent: Thursday, April 02, 2015 8:11 AM
To: Haopu Wang
Cc: user; dev@spark.apache.org
Subject: Re: Can I call aggregate UDF in DataFrame?
You totally can.
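For instance, a minimal sketch (the column names are made up) using the
built-in aggregate functions through the DataFrame API:

  import org.apache.spark.sql.functions._

  df.groupBy("department").agg(avg("salary"), max("age"))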
Hey Marcelo,
Great question. Right now, some of the more active developers have an
account that allows them to log into this cluster to inspect logs (we
copy the logs from each run to a node on that cluster). The
infrastructure is maintained by the AMPLab.
I will put you in touch with someone
I use Thrift, then base64-encode the binary and save it as text file lines
that are snappy- or gzip-encoded.
This makes it very easy to copy small chunks locally and play with subsets
of the data without depending on HDFS / Hadoop server infrastructure, for
example.
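A rough sketch of the write path (the record type is a stand-in for
whatever Thrift-generated class you use):

  import java.util.Base64
  import org.apache.thrift.TSerializer
  import org.apache.thrift.protocol.TBinaryProtocol
  import org.apache.hadoop.io.compress.GzipCodec

  // records: RDD[MyThriftRecord], a Thrift-generated type
  val lines = records.mapPartitions { it =>
    // TSerializer is not Serializable, so build one per partition
    val ser = new TSerializer(new TBinaryProtocol.Factory())
    it.map(rec => Base64.getEncoder.encodeToString(ser.serialize(rec)))
  }
  lines.saveAsTextFile("hdfs:///data/events", classOf[GzipCodec])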
On Thu, Mar 26, 2015 at
Sorry to bother you again, but I think this is an important issue for the
applicability of SGD in Spark MLlib. Could the Spark developers please
comment on it?
-Original Message-
From: Ulanov, Alexander
Sent: Monday, March 30, 2015 5:00 PM
To: dev@spark.apache.org
Subject: Stochastic
Thanks, that sounds interesting! How do you load the files into Spark? Did
you consider using multiple files instead of file lines?
From: Hector Yee [mailto:hector@gmail.com]
Sent: Wednesday, April 01, 2015 11:36 AM
To: Ulanov, Alexander
Cc: Evan R. Sparks; Stephen Boesch; dev@spark.apache.org
This is awesome! I can write the apps for it to make the Web UI more
functional!
On Wed, Apr 1, 2015 at 12:37 AM, Tathagata Das tathagata.das1...@gmail.com
wrote:
This is a significant effort that Reynold has undertaken, and I am super
glad to see that it's finally taking a concrete form.
Jeremy, thanks for the explanation!
What if you used the Parquet file format instead? You could still write a
number of small files as you do, but you wouldn't have to implement a
writer/reader, because Parquet readers and writers are available in various
languages.
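For example, a minimal sketch with the Spark 1.3 DataFrame API (the paths
and record type are made up):

  // records: an RDD of case class instances
  val df = sqlContext.createDataFrame(records)
  df.saveAsParquetFile("hdfs:///data/features.parquet")

  val loaded = sqlContext.parquetFile("hdfs:///data/features.parquet")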
From: Jeremy Freeman