Dear Mr. Lemaitre,

Thanks a lot for sharing your time and knowledge. Unfortunately, it throws the following error:

Traceback (most recent call last):
119
  File "D:/mifs-master_2/MU/learning-from-imbalanced-classes-master/learning-from-imbalanced-classes-master/continuous/Final Logit/SMOTENC/logit-final - Copy.py", line 419, in <module>
41
    pipeline_with_resampling = make_pipeline(SMOTENC(categorical_features=cat_indices1), pipeline)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 594, in make_pipeline
    return Pipeline(_name_estimators(steps), memory=memory)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 119, in __init__
    self._validate_steps()
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 167, in _validate_steps
    " '%s' (type %s) doesn't" % (t, type(t)))
TypeError: All intermediate steps should be transformers and implement fit and transform. 'SMOTENC(categorical_features=['x95', 'x97', 'x99', 'x100', 'x121_1', 'x121_2', 'x121_3', 'x121_4', 'x121_5', 'x121_6', 'x121_7', 'x121_8', 'x121_9', 'x121_10', 'x121_11', 'x121_12', 'x121_13', 'x121_14', 'x121_15', 'x121_16', 'x121_17', 'x121_18', 'x121_19', 'x121_20', 'x121_21', 'x121_22', 'x121_23', 'x121_24', 'x121_25', 'x121_26', 'x121_27', 'x121_28', 'x121_29', 'x121_30', 'x121_31', 'x121_32', 'x121_33', 'x121_34', 'x121_35', 'x121_36', 'x121_37'], k_neighbors=5, n_jobs=1, random_state=None, sampling_strategy='auto')' (type <class 'imblearn.over_sampling._smote.SMOTENC'>) doesn't

Thanks in advance.
Best regards,

On Mon, Jan 21, 2019 at 2:26 PM Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:

> SMOTENC will internally one hot encode the features, generate new
> features, and finally decode.
> So you need to do something like:
>
> from imblearn.pipeline import make_pipeline, Pipeline
>
> num_indices1 = list(X.iloc[:,np.r_[0:94,95,97,100:123]].columns.values)
> cat_indices1 = list(X.iloc[:,np.r_[94,96,98,99,123:160]].columns.values)
> print(len(num_indices1))
> print(len(cat_indices1))
>
> pipeline=Pipeline(steps= [
>     # Categorical features
>     ('feature_processing', FeatureUnion(transformer_list = [
>         ('categorical', MultiColumn(cat_indices1)),
>
>         #numeric
>         ('numeric', Pipeline(steps = [
>             ('select', MultiColumn(num_indices1)),
>             ('scale', StandardScaler())
>         ]))
>     ])),
>     ('clf', rg)
>     ]
> )
>
> pipeline_with_resampling = make_pipeline(SMOTENC(categorical_features=cat_indices1), pipeline)
>
>
> On Sun, 20 Jan 2019 at 18:05, S Hamidizade <hamidizad...@gmail.com> wrote:
>
>> Dear Scikit-learners
>> Hi.
>>
>> I would greatly appreciate it if you could let me know how to use
>> SMOTENC. I wrote:
>>
>> num_indices1 = list(X.iloc[:,np.r_[0:94,95,97,100:123]].columns.values)
>> cat_indices1 = list(X.iloc[:,np.r_[94,96,98,99,123:160]].columns.values)
>> print(len(num_indices1))
>> print(len(cat_indices1))
>>
>> pipeline=Pipeline(steps= [
>>     # Categorical features
>>     ('feature_processing', FeatureUnion(transformer_list = [
>>         ('categorical', MultiColumn(cat_indices1)),
>>
>>         #numeric
>>         ('numeric', Pipeline(steps = [
>>             ('select', MultiColumn(num_indices1)),
>>             ('scale', StandardScaler())
>>         ]))
>>     ])),
>>     ('clf', rg)
>>     ]
>> )
>>
>> Therefore, as indicated, I have 5 categorical features. In fact,
>> indices 123 to 160 relate to one categorical feature with 37 possible
>> values, which is converted into 37 columns using get_dummies.
>> Sorry, I think SMOTENC should be inserted before the classifier ('clf',
>> rg) but I don't know how to define "categorical_features" in SMOTENC.
>> Besides, could you please let me know where to use imblearn.pipeline?
>>
>> Thanks in advance.
>> Best regards,
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
>
> --
> Guillaume Lemaitre
> INRIA Saclay - Parietal team
> Center for Data Science Paris-Saclay
> https://glemaitre.github.io/
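
A note on the error and on defining categorical_features. The traceback frames point into sklearn\pipeline.py, which suggests that make_pipeline was imported from sklearn.pipeline rather than from imblearn.pipeline; the scikit-learn version rejects samplers such as SMOTENC as intermediate steps, while the imblearn version accepts them. Separately, the imbalanced-learn documentation of that period describes categorical_features as an array of integer indices or a boolean mask, not column names (later releases may also accept names for DataFrame input; treat that as an assumption to verify against the installed version). The sketch below uses a small hypothetical DataFrame in place of X, with made-up column names, and shows one way to turn the selected column names into positions or a mask with pandas alone.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for X: 6 columns instead of the 160 in the thread.
X = pd.DataFrame(
    np.zeros((4, 6)),
    columns=["x1", "x2", "x95", "x3", "x121_1", "x121_2"],
)

# Column names selected by position, in the same style as the thread's snippet.
cat_names = list(X.iloc[:, np.r_[2, 4:6]].columns.values)

# Integer positions of those columns within X.
cat_positions = X.columns.get_indexer(cat_names).tolist()

# Equivalent boolean mask over all columns of X.
cat_mask = X.columns.isin(cat_names)

print(cat_names)
print(cat_positions)
print(cat_mask.tolist())
```

Either cat_positions or cat_mask could then be passed as SMOTENC(categorical_features=...). The positions refer to the columns of the array that actually reaches SMOTENC, so they stay valid only if no earlier step reorders or drops columns.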