Hi,

Does your code run on a sample of the data? Does your data contain categorical columns? If so, see https://pandas.pydata.org/pandas-docs/stable/categorical.html. Also, check out http://www.pytables.org.
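
To make the categorical suggestion concrete, here is a minimal sketch of how one might compare the memory footprint of a text column before and after converting it to the category dtype. The column name and values are invented for illustration; they do not come from the original poster's data, and the saving depends on how few distinct values the column really has.

```python
import pandas as pd

# Hypothetical data: a text column with only three distinct values, repeated many times.
df = pd.DataFrame({"city": ["Delhi", "Mumbai", "Pune", "Delhi"] * 25_000})

# deep=True includes the memory of the string values themselves.
bytes_as_text = df["city"].memory_usage(deep=True)

# With the category dtype each distinct string is stored once
# and the rows hold small integer codes instead.
df["city"] = df["city"].astype("category")
bytes_as_category = df["city"].memory_usage(deep=True)

print(f"text column:     {bytes_as_text:,} bytes")
print(f"category column: {bytes_as_category:,} bytes")
```

This only helps when a column has few distinct values relative to its length; a column of mostly unique strings gains little or nothing from the conversion.
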
Albert-Jan
________________________________
From: Python-list <python-list-bounces+sjeik_appie=hotmail....@python.org> on behalf of Bhaskar Dhariyal <dhariyalbhas...@gmail.com>
Sent: Thursday, June 29, 2017 4:34:56 AM
To: python-list@python.org
Subject: Re: Combining 2 data series into one

On Wednesday, 28 June 2017 23:43:57 UTC+5:30, Albert-Jan Roskam wrote:
> (sorry for top posting)
> Yes, I'd try pd.concat([df1, df2]).
> Or this:
> df['both_names'] = df.apply(lambda row: row['name'] + ' ' + row['surname'], axis=1)
> ________________________________
> From: Python-list <python-list-bounces+sjeik_appie=hotmail....@python.org> on behalf of Paul Barry <paul.james.ba...@gmail.com>
> Sent: Wednesday, June 28, 2017 12:30:25 PM
> To: Bhaskar Dhariyal
> Cc: python-list@python.org
> Subject: Re: Combining 2 data series into one
>
> Maybe look at using .concat instead of +
>
> See:
> http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.06-Concat-And-Append.ipynb
>
> On 28 June 2017 at 13:02, Paul Barry <paul.james.ba...@gmail.com> wrote:
>
> > Maybe try your code on a sub-set of your data - perhaps 1000 lines of
> > data? - to see if that works.
> >
> > Anyone else on the list suggest anything to try here?
> >
> > On 28 June 2017 at 12:50, Bhaskar Dhariyal <dhariyalbhas...@gmail.com>
> > wrote:
> >
> >> No, it didn't work. I am getting a memory error. Using a 32GB RAM system.
> >>
> >> On Wed, Jun 28, 2017 at 5:17 PM, Paul Barry <paul.james.ba...@gmail.com>
> >> wrote:
> >>
> >>> On the line that's failing, your code is this:
> >>>
> >>> combinedX=combinedX+dframe['tf']
> >>>
> >>> which uses combinedX on both sides of the assignment statement - note
> >>> that Python is reporting a 'MemoryError', which may be happening due to
> >>> this "double use" (maybe). What happens if you create a new dataframe,
> >>> like this:
> >>>
> >>> newX = combinedX + dframe['tf']
> >>>
> >>> Regardless, it looks like you are doing a dataframe merge. Jake V's
> >>> book has an excellent section on it here:
> >>> http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.07-Merge-and-Join.ipynb
> >>> - this should take about 20 minutes to read, and may be of use to you.
> >>>
> >>> Paul.
> >>>
> >>> On 28 June 2017 at 12:19, Bhaskar Dhariyal <dhariyalbhas...@gmail.com>
> >>> wrote:
> >>>
> >>>> On Wednesday, 28 June 2017 14:43:48 UTC+5:30, Paul Barry wrote:
> >>>> > This should do it:
> >>>> >
> >>>> > >>> import pandas as pd
> >>>> > >>>
> >>>> > >>> df1 = pd.DataFrame(['bhaskar', 'Rohit'], columns=['first_name'])
> >>>> > >>> df1
> >>>> >   first_name
> >>>> > 0    bhaskar
> >>>> > 1      Rohit
> >>>> > >>> df2 = pd.DataFrame(['dhariyal', 'Gavval'], columns=['last_name'])
> >>>> > >>> df2
> >>>> >   last_name
> >>>> > 0  dhariyal
> >>>> > 1    Gavval
> >>>> > >>> df = pd.DataFrame()
> >>>> > >>> df['name'] = df1['first_name'] + ' ' + df2['last_name']
> >>>> > >>> df
> >>>> >                name
> >>>> > 0  bhaskar dhariyal
> >>>> > 1      Rohit Gavval
> >>>> > >>>
> >>>> >
> >>>> > Again, I draw your attention to Jake VanderPlas's excellent book, which
> >>>> > is available for free on the web. All of these kinds of data
> >>>> > manipulations are covered there:
> >>>> > https://github.com/jakevdp/PythonDataScienceHandbook - the hard copy is
> >>>> > worth owning too (if you plan to do a lot of work using numpy/pandas).
> >>>> >
> >>>> > I'd also recommend the upcoming 2nd edition of Wes McKinney's "Python
> >>>> > for Data Analysis" book - I've just finished tech reviewing it for
> >>>> > O'Reilly, and it is very good, too - highly recommended.
> >>>> >
> >>>> > Regards.
> >>>> >
> >>>> > Paul.
> >>>> >
> >>>> > On 28 June 2017 at 07:11, Bhaskar Dhariyal <dhariyalbhas...@gmail.com>
> >>>> > wrote:
> >>>> >
> >>>> > > Hi!
> >>>> > >
> >>>> > > I have 2 dataframes, i.e. df1['first_name'] and df2['last_name']. I
> >>>> > > want to combine them into df['name']. How do I do this using a pandas
> >>>> > > dataframe?
> >>>> > >
> >>>> > > first_name
> >>>> > > ----------
> >>>> > > bhaskar
> >>>> > > Rohit
> >>>> > >
> >>>> > > last_name
> >>>> > > -----------
> >>>> > > dhariyal
> >>>> > > Gavval
> >>>> > >
> >>>> > > should appear as
> >>>> > >
> >>>> > > name
> >>>> > > ----------
> >>>> > > bhaskar dhariyal
> >>>> > > Rohit Gavval
> >>>> > >
> >>>> > > Thanks
> >>>> > > --
> >>>> > > https://mail.python.org/mailman/listinfo/python-list
> >>>> >
> >>>> > --
> >>>> > Paul Barry, t: @barrypj <https://twitter.com/barrypj> - w:
> >>>> > http://paulbarry.itcarlow.ie - e: paul.ba...@itcarlow.ie
> >>>> > Lecturer, Computer Networking: Institute of Technology, Carlow, Ireland.
> >>>>
> >>>> https://drive.google.com/open?id=0Bw2Avni0DUa3aFJKdC1Xd2trM2c
> >>>> link to code

Hi Albert!

Thanks for replying. That issue was resolved, but I'm stuck with a new problem. I generated a tf-idf representation for a pandas dataframe where each row contains some text. I also had some numerical features which I wanted to combine with the tf-idf matrix. But this gives a memory error.
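
For the tf-idf question above, one thing that may be worth trying is to keep everything sparse when combining, since converting a tf-idf matrix to a dense array or DataFrame is a common cause of a MemoryError. The sketch below assumes the tf-idf matrix comes from scikit-learn's TfidfVectorizer (the thread does not say which library was used), and the frame, column names and values are invented for illustration.

```python
import pandas as pd
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical frame: one text column and two numeric feature columns.
df = pd.DataFrame({
    "text": ["red apple pie", "green apple", "blue sky at night"],
    "length": [13.0, 11.0, 17.0],
    "score": [0.5, 0.1, 0.9],
})

# fit_transform returns a scipy sparse matrix; avoid calling .toarray() on it.
tfidf = TfidfVectorizer().fit_transform(df["text"])

# Convert only the narrow numeric block to sparse, then stack column-wise.
numeric = sparse.csr_matrix(df[["length", "score"]].to_numpy())
combined = sparse.hstack([tfidf, numeric], format="csr")

print(combined.shape)  # rows x (vocabulary size + 2)
```

The result stays a sparse matrix, which most scikit-learn estimators accept directly, so the full rows-by-vocabulary table never has to be materialised in memory.
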