Hi All,

We are pulling data from Oracle tables and writing it out as partitioned 
parquet files. This is a daily process, and it works fine through the 18th 
daily load. On the 19th day, however, the DataFrame load hangs and the load 
action is invoked more than once. If we remove the existing data and run only 
the 19th day, it loads successfully. Why does the 19th day fail in APPEND mode 
when the same day's load works fine on its own? On the Spark UI we can see the 
first load job takes around 5 minutes, while the duplicate load jobs take only 
a few seconds. We are stuck on this, and we need to process 60 days of data.
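
In case it helps to reproduce, here is a minimal sketch of the daily load 
pattern described above. The connection URL, credentials, table, and column 
names are all assumptions for illustration, not taken from our actual job:

```scala
// Hypothetical sketch of the daily Oracle -> partitioned parquet append
// (Spark 1.x API; all identifiers below are placeholders).
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}

object DailyOracleLoad {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DailyOracleLoad"))
    val sqlContext = new SQLContext(sc)

    // Read one day's rows from Oracle over JDBC; the subquery pushes the
    // date filter down to Oracle so only that day's slice is transferred.
    val df = sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
      .option("dbtable",
        "(SELECT * FROM DAILY_FACTS WHERE LOAD_DT = DATE '2016-04-19')")
      .option("user", "etl_user")
      .option("password", "***")
      .load()

    // Append this day's slice as a new partition of the parquet dataset.
    df.write
      .mode(SaveMode.Append)
      .partitionBy("LOAD_DT")
      .parquet("hdfs:///data/daily_facts_parquet")

    sc.stop()
  }
}
```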

Thank you 
Anil Langote


> On Apr 20, 2016, at 1:12 PM, Wei Chen <wei.chen.ri...@gmail.com> wrote:
> 
> Found it. In case someone else is looking for this:
> cvModel.bestModel.asInstanceOf[org.apache.spark.ml.classification.LogisticRegressionModel].weights
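> 
> A fuller sketch of that context, in case it is useful (Spark 1.x API, where 
> the attribute is `weights`; `training` is an assumed DataFrame of 
> label/features columns, not something from this thread):
> 
> ```scala
> import org.apache.spark.ml.classification.{LogisticRegression, LogisticRegressionModel}
> import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
> import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
> 
> val lr = new LogisticRegression()
> val grid = new ParamGridBuilder()
>   .addGrid(lr.regParam, Array(0.01, 0.1))
>   .build()
> 
> val cv = new CrossValidator()
>   .setEstimator(lr)
>   .setEvaluator(new BinaryClassificationEvaluator())
>   .setEstimatorParamMaps(grid)
>   .setNumFolds(3)
> 
> val cvModel = cv.fit(training)  // `training`: assumed (label, features) DataFrame
> 
> // bestModel is typed as the generic Model[_], so a cast is needed to
> // reach the model-specific members like weights and intercept.
> val best = cvModel.bestModel.asInstanceOf[LogisticRegressionModel]
> println(best.weights)    // per-feature weights (renamed `coefficients` in Spark 2.x)
> println(best.intercept)
> ```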
> 
> On Tue, Apr 19, 2016 at 1:12 PM, Wei Chen <wei.chen.ri...@gmail.com 
> <mailto:wei.chen.ri...@gmail.com>> wrote:
> Hi All,
> 
> I am using the example of model selection via cross-validation from the 
> documentation here: http://spark.apache.org/docs/latest/ml-guide.html 
> <http://spark.apache.org/docs/latest/ml-guide.html>. After I get the 
> "cvModel", I would like to see the weights for each feature for the best 
> logistic regression model. I've been looking at the methods and attributes of 
> this "cvModel" and "cvModel.bestModel" and still can't figure out where these 
> weights are referred. It must be somewhere since we can use "cvModel" to 
> transform a new dataframe. Your help is much appreciated.
> 
> 
> Thank you,
> Wei
> 
> 
> 
> -- 
> Wei Chen, Ph.D.
> Astronomer and Data Scientist
> Phone: (832)646-7124
> Email: wei.chen.ri...@gmail.com <mailto:wei.chen.ri...@gmail.com>
> LinkedIn: https://www.linkedin.com/in/weichen1984 
> <https://www.linkedin.com/in/weichen1984>
