Re: Spark - Not contains on Spark dataframe

2017-03-04 Thread KhajaAsmath Mohammed
Hi,

I was able to resolve the issue with the conditions below.

datapoint_df(Constants.Datapoint.Vin).like("012345")

datapoint_filter_df.filter(
  datapoint_filter_df(Constants.Datapoint.Vin) rlike "^([A-Z]|[0-9]|[a-z])+$"
) // checks that the value is alphanumeric
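
For anyone who wants to try the pattern without a Spark session, here is a minimal sketch in plain Scala. As far as I know, rlike uses Java regular expressions and looks for a match anywhere in the value, so the ^ and $ anchors are what make the test apply to the whole string. The helper name isAlphanumeric and the sample values are made up for illustration. Also note that like("012345") without % wildcards should match only values equal to "012345", which is stricter than contains("012345").

```scala
import java.util.regex.Pattern

// Same pattern as in the rlike call above.
val alphanumeric: Pattern = Pattern.compile("^([A-Z]|[0-9]|[a-z])+$")

// Hypothetical helper: find() looks for a match anywhere in the input,
// so the ^ and $ anchors force the whole value to be alphanumeric.
def isAlphanumeric(vin: String): Boolean =
  vin != null && alphanumeric.matcher(vin).find()

// Made-up sample values, not taken from the real data.
val samples = Seq("1HGCM82633A004352", "ABC 123", "VIN@123", "")

// Only the purely alphanumeric, non-empty value is kept.
val kept = samples.filter(isAlphanumeric)
```

The character class [A-Za-z0-9] would express the same check more compactly.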

Thanks,
Asmath

On Tue, Feb 28, 2017 at 10:49 AM, KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:

> Hi,
>
> Could anyone suggest how to resolve an issue I am facing with a
> "not contains" condition on a DataFrame column?
>
> Here is the code. My DataFrame is not getting filtered with the
> conditions below. I even tried both not and ! on the Column. Any
> suggestions?
>
> import org.apache.spark.sql.Column
> import org.apache.spark.sql.functions.not
>
> def filterDatapointRawCountsDF(vin: Column): Column = {
>   val filterColumn: Column =
>     not(vin.contains("VIN")) ||
>     not(vin.contains("Ÿ")) ||
>     not(vin.contains("0123456789ABCDEFG"))
>   filterColumn
> }


Re: Spark - Not contains on Spark dataframe

2017-02-28 Thread KhajaAsmath Mohammed
Hi,

My DataFrame has records matching the conditions below, but it never gets
filtered. I always get the total count of the original records, even after
applying the filter function below. Am I doing anything wrong here?

Note: I tried both or and || too.

def filterDatapointRawCountsDataFrame(datapoint_df: DataFrame): DataFrame = {
  // println("." + datapoint_df(Constants.Datapoint.Vin).cast(String))
  val x = datapoint_df.filter(
  datapoint_df(Constants.Datapoint.Vin).contains("VIN") or
  datapoint_df(Constants.Datapoint.Vin).contains("XXX") or
  datapoint_df(Constants.Datapoint.Vin).contains("Ÿ") or
  datapoint_df(Constants.Datapoint.Vin).contains("0123456789ABCDEFG") or
  datapoint_df(Constants.Datapoint.Vin).contains("") or
  datapoint_df(Constants.Datapoint.Vin).contains("XXX") or
  datapoint_df(Constants.Datapoint.Vin).contains("") or
  datapoint_df(Constants.Datapoint.Vin).contains("") or
  datapoint_df(Constants.Datapoint.Vin).contains("") or
  datapoint_df(Constants.Datapoint.Vin).contains("@") or
  datapoint_df(Constants.Datapoint.Vin).contains("?") or
  datapoint_df(Constants.Datapoint.Vin).contains("*") or
  datapoint_df(Constants.Datapoint.Vin).contains(" ") or
  datapoint_df(Constants.Datapoint.Vin).contains("FFF") or
  datapoint_df(Constants.Datapoint.Vin).contains("INTAKE") or
  datapoint_df(Constants.Datapoint.Vin).contains("SERVICES") or
  datapoint_df(Constants.Datapoint.Vin).contains("11") or
  datapoint_df(Constants.Datapoint.Vin).contains("JJ") or
  datapoint_df(Constants.Datapoint.Vin).contains("AA") or
  datapoint_df(Constants.Datapoint.Vin).contains("BB") or
  datapoint_df(Constants.Datapoint.Vin).contains("TT") or
  datapoint_df(Constants.Datapoint.Vin).contains("NUMBER") or
  datapoint_df(Constants.Datapoint.Vin).contains("NOTSUREATTHISTIME") or
  datapoint_df(Constants.Datapoint.Vin).contains("1800CALLJOESNAPPY") or
  datapoint_df(Constants.Datapoint.Vin).contains("34567890") or
  datapoint_df(Constants.Datapoint.Vin).contains("012345") or
  datapoint_df(Constants.Datapoint.Vin).contains("JALES") or
  datapoint_df(Constants.Datapoint.Vin).contains("SATAN") or
  datapoint_df(Constants.Datapoint.Vin).contains("SIMULAT")
  )
  x
}
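
One thing worth checking in the function above: several of the patterns are empty strings (possibly characters that were lost somewhere along the way), and every non-null string contains the empty string. If Column.contains follows the same rule as String.contains, which I believe it does, a single contains("") in an or chain makes the predicate true for every non-null row, so filter keeps everything and the count does not change. In addition, filter keeps the rows for which the predicate is true, so dropping the bad VINs would mean negating the whole chain. A minimal plain-Scala sketch of that logic, with made-up sample values and a hypothetical matchesAny helper:

```scala
// Patterns modelled on the filter above; the empty string is the interesting one.
val withEmpty = Seq("VIN", "XXX", "", "012345")
val withoutEmpty = withEmpty.filter(_.nonEmpty)

// Hypothetical helper: true if the value contains any of the patterns,
// i.e. the same shape as the chain of contains(...) or contains(...).
def matchesAny(value: String, patterns: Seq[String]): Boolean =
  patterns.exists(p => value.contains(p))

// Made-up sample values.
val vins = Seq("1HGCM82633A004352", "VIN", "XXX012345")

// With "" in the list every value matches, so nothing is filtered out.
val keptWithEmpty = vins.filter(v => matchesAny(v, withEmpty))

// Without "" and with the predicate negated, only the clean value survives.
val cleaned = vins.filter(v => !matchesAny(v, withoutEmpty))
```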


Thanks,

Asmath




Spark - Not contains on Spark dataframe

2017-02-28 Thread KhajaAsmath Mohammed
Hi,

Could anyone suggest how to resolve an issue I am facing with a
"not contains" condition on a DataFrame column?

Here is the code. My DataFrame is not getting filtered with the
conditions below. I even tried both not and ! on the Column. Any
suggestions?

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.not

def filterDatapointRawCountsDF(vin: Column): Column = {
  val filterColumn: Column =
    not(vin.contains("VIN")) ||
    not(vin.contains("Ÿ")) ||
    not(vin.contains("0123456789ABCDEFG"))
  filterColumn
}
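
A possible reason the condition above has no effect: not(a) || not(b) || not(c) is equivalent to not(a && b && c) by De Morgan's law, so it is false only for a value that contains all three substrings at once. To keep the rows that contain none of them, the negated conditions would be joined with && instead. Here is a small plain-Scala sketch of the difference, using ordinary Booleans in place of Spark Column expressions; the sample values and function names are made up:

```scala
val banned = Seq("VIN", "Ÿ", "0123456789ABCDEFG")

// Shape of the original condition: true unless the value contains ALL banned substrings.
def orOfNots(vin: String): Boolean =
  banned.map(b => !vin.contains(b)).reduce(_ || _)

// Intended condition: true only if the value contains NONE of them.
def andOfNots(vin: String): Boolean =
  banned.map(b => !vin.contains(b)).reduce(_ && _)

// Made-up sample values.
val vins = Seq("1HGCM82633A004352", "VIN", "0123456789ABCDEFG")

val keptByOr = vins.filter(orOfNots)   // every sample value passes
val keptByAnd = vins.filter(andOfNots) // only the first sample value passes
```

In Column terms the second form would look like !vin.contains("VIN") && !vin.contains("Ÿ") && !vin.contains("0123456789ABCDEFG"), passed to filter on the DataFrame.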