Hi There Spark Users,

I have been trying to follow along with this posted xgboost Spark Databricks notebook,
but I keep getting ValueError: bad input shape ().

I tried a few things to fix it … the complete SO post with details =>


import numpy as np
import xgboost as xgb

# Pull the feature vectors and labels back to the driver
features = inputTrainingDF.select("features").collect()
labels = inputTrainingDF.select("label").collect()

X = np.asarray(map(lambda v: v[0].toArray(), features))
Y = np.asarray(map(lambda v: v[0], labels))

xgbClassifier = xgb.XGBClassifier(max_depth=3, seed=18238,
                                  objective='binary:logistic')

model = xgbClassifier.fit(X, Y)
ValueError: bad input shape () 


def trainXGbModel(partitionKey, labelAndFeatures):
  X = np.asarray(map(lambda v: v[1].toArray(), labelAndFeatures))
  Y = np.asarray(map(lambda v: v[0], labelAndFeatures))
  xgbClassifier = xgb.XGBClassifier(max_depth=3, seed=18238,
                                    objective='binary:logistic')
  model = xgbClassifier.fit(X, Y)
  return [partitionKey, model]

xgbModels = inputTrainingDF\
.select("education", "label", "features")\
.map(lambda row: [row[0], [row[1], row[2]]])\
.map(lambda v: trainXGbModel(v[0], list(v[1])))

ValueError: bad input shape ()
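
One possible cause I have not ruled out, assuming this runs under Python 3: map() returns a lazy iterator there, so np.asarray(map(...)) would wrap the iterator object itself in a 0-dimensional array with shape (), which matches the error message. A minimal sketch of that behaviour, where the rows list is made up for illustration and is not my real data:

```python
import numpy as np

# Hypothetical stand-in for collected label rows: one-element tuples.
rows = [(0.0,), (1.0,), (1.0,)]

# Under Python 3, map() returns a lazy iterator; np.asarray wraps the
# iterator object itself in a 0-dimensional object array.
lazy = np.asarray(map(lambda v: v[0], rows))

# Materialising the values first gives a regular 1-D array.
eager = np.asarray([v[0] for v in rows])

print(lazy.shape)   # ()
print(eager.shape)  # (3,)
```

If that is what is happening, wrapping each map(...) in list(...) or using a list comprehension should hand fit() proper arrays, but I have not verified this against the notebook.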

Could someone please try to look at this?

Thank you for your time and research!
