Hi,
I am using 1 master and 3 slave workers to process 27 GB of Wikipedia
data. The data is tab-separated, and every line contains the information
for one Wikipedia page: the title of the page and the page contents.
I am using a regular expression to extract links, as mentioned in
the
I just thought maybe we could print a warning whenever that error occurs,
so that the user can tune either the memoryFraction or the executor memory
options. The warning would be displayed when TaskSetManager receives task
failures due to OOM.
Prashant Sharma
On Mon, May 5, 2014 at 2:10 PM, Ajay Nair
Hi,
Why does the MLlib Vector use Double as the default value type?
/**
 * Represents a numeric vector, whose index type is Int and value type is Double.
 */
trait Vector extends Serializable {
  /**
   * Size of the vector.
   */
  def size: Int
  /**
   * Converts the instance to a double array.
+1. It would be nice if we could use different types in Vector.
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
On Mon, May 5, 2014 at 2:41 PM, Debasish Das debasish.da...@gmail.com wrote:
Hi,
Is this a Breeze issue, or can Breeze take templates on Float / Double?
If Breeze can take templates, then it is a minor fix for Vectors.scala,
right?
Thanks.
Deb
On Mon, May 5, 2014 at 2:45 PM, DB Tsai dbt...@stanford.edu wrote:
+1. It would be nice if we could use different types in Vector.
Breeze can take any type (Int, Long, Double, and Float) in the matrix
template.
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
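A minimal sketch of what a vector that is generic in its value type might look like, assuming the standard library's `Numeric` type class is used to convert values. `GenericVector` and `DenseGenericVector` are hypothetical names for illustration only; they are not part of MLlib or Breeze, and this is not how either library defines its vectors.

```scala
// Hypothetical sketch: GenericVector and DenseGenericVector are illustrative
// names, not part of MLlib or Breeze.
trait GenericVector[V] extends Serializable {
  /** Size of the vector. */
  def size: Int

  /** Value stored at index i. */
  def apply(i: Int): V

  /** Converts the instance to a Double array, whatever the storage type. */
  def toDoubleArray: Array[Double]
}

// Dense implementation backed by an array of the chosen value type.
// The implicit Numeric[V] supplies the conversion to Double.
final class DenseGenericVector[V](values: Array[V])(implicit num: Numeric[V])
    extends GenericVector[V] {
  def size: Int = values.length
  def apply(i: Int): V = values(i)
  def toDoubleArray: Array[Double] = values.map(v => num.toDouble(v))
}

object GenericVectorExample {
  def main(args: Array[String]): Unit = {
    // The same class stores Float or Int values without code changes.
    val floats = new DenseGenericVector(Array(1.5f, 2.0f, 3.25f))
    val ints = new DenseGenericVector(Array(1, 2, 3))
    println(floats.toDoubleArray.mkString(", "))
    println(ints.toDoubleArray.mkString(", "))
  }
}
```

The storage type is chosen at the call site, while any code that needs Doubles can still ask for `toDoubleArray`.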
On Mon, May 5, 2014 at 2:56 PM, Debasish Das
Is anyone facing issues due to this? If not, then I guess doubles are
fine...
For me it's not a big deal, as there is enough memory available...
On Mon, May 5, 2014 at 3:06 PM, David Hall d...@cs.berkeley.edu wrote:
LBFGS and other optimizers would not work immediately, as they require
I fixed the index type and value type to keep things simple, especially
since we need to provide Java and Python APIs. For raw features and
feature transformations, we should allow generic types. -Xiangrui
On Mon, May 5, 2014 at 3:40 PM, DB Tsai dbt...@stanford.edu wrote:
David,
Could we use Int, Long, or Float for the feature data, and Double for the
optimizer?
Yes. Breeze doesn't allow operations on mixed types, so you'd need to
convert the double vectors to Floats if you wanted, e.g. dot
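A minimal sketch of the conversion step, assuming plain Scala arrays rather than Breeze vectors. `MixedPrecisionDot`, `dot`, and `toDoubles` are hypothetical names. The message above mentions converting the Double vectors to Floats; this sketch widens the Float side to Double instead, which is one possible choice and not something the thread settles. Either way, both operands must share one element type before the dot product.

```scala
// Plain-array sketch of the conversion step; this does not use Breeze.
object MixedPrecisionDot {
  /** Dot product over two Double arrays of equal length. */
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      sum += a(i) * b(i)
      i += 1
    }
    sum
  }

  /** Widens Float features to Double so both operands share one type. */
  def toDoubles(features: Array[Float]): Array[Double] =
    features.map(_.toDouble)

  def main(args: Array[String]): Unit = {
    val features: Array[Float] = Array(1.0f, 2.0f, 0.5f) // feature data kept as Float
    val weights: Array[Double] = Array(0.5, 0.25, 2.0)   // optimizer state kept as Double
    println(dot(toDoubles(features), weights))
  }
}
```

The conversion allocates a new array on every call, so storing features as Float saves memory at rest but adds a copy whenever they meet Double weights.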
Hi,
I have seen three different ways to query data from Spark:
1. Default SQL support (
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/sql/examples/HiveFromSpark.scala
)
2. Shark
3. BlinkDB
I would like to know which one is more efficient.