Re: [Spark SQL]: sql.DataFrame.replace to accept regexp

2019-03-01 Thread Richard Garris
You can file a feature request at https://issues.apache.org/jira/projects/SPARK/. As a workaround, you can create a user-defined function like so:
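The workaround code itself is cut off in this archive excerpt. What follows is only a minimal sketch of what a regex-replacing UDF might look like in PySpark, not the poster's original code. The names `regex_replace` and `make_regex_replace_udf` are hypothetical, and the sketch assumes a string column where NULL values should pass through unchanged.

```python
import re
from typing import Optional


def regex_replace(value: Optional[str], pattern: str, replacement: str) -> Optional[str]:
    """Replace every regex match in value; None (SQL NULL) passes through unchanged."""
    if value is None:
        return None
    return re.sub(pattern, replacement, value)


def make_regex_replace_udf(pattern: str, replacement: str):
    """Wrap regex_replace as a Spark UDF (hypothetical helper; requires pyspark)."""
    # Imported lazily so the pure-Python function above can be used without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    return udf(lambda v: regex_replace(v, pattern, replacement), StringType())
```

Assuming a DataFrame `df` with a string column `name`, usage could look like `df.withColumn("name", make_regex_replace_udf(r"\s+", " ")(df["name"]))`. For a single column, the built-in `pyspark.sql.functions.regexp_replace` is likely the better choice, since it avoids the overhead of a Python UDF.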

Re: flatMap() returning large class

2017-12-18 Thread Richard Garris
-deep-learning/blob/f088de45daec06865ac02a9ec1323eb2c9eebb89/src/main/scala/com/databricks/sparkdl/ImageUtils.scala You can reuse this code potentially. Richard Garris Principal Architect Databricks, Inc 650.200.0840 rlgar...@databricks.com On December 17, 2017 at 3:12:41 PM, Don Drake (dondr

Re: flatMap() returning large class

2017-12-14 Thread Richard Garris
storing it as a vector or Array vs a large Java class object? That might be the more prudent approach. -RG Richard Garris Principal Architect Databricks, Inc 650.200.0840 rlgar...@databricks.com On December 14, 2017 at 10:23:00 AM, Marcelo Vanzin (van...@cloudera.com) wrote: This sounds like

Re: LDA and Maximum Iterations

2016-10-19 Thread Richard Garris
Hi Frank, two suggestions: 1. I would recommend caching the corpus prior to running LDA. 2. If you are using EM, I would tweak the sample size using the setMiniBatchFraction parameter to decrease the sample per iteration. -Richard On Tue, Sep 20, 2016 at 10:27 AM, Frank Zhang <