Hello, Michael,
I discovered that the problem is that two columns of data have been put together
and are recognised as one column.
This is very strange. I would like to understand the subject well.
Also, what ways are there to investigate the nature of objects dynamically?
Some object types are only shown as "object". Is there anything I can type in
Python to reveal what they contain?
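As a rough sketch of the kind of commands that reveal what an object is: the frame and its column names below are made up for illustration and are not the data discussed in this thread.

```python
import pandas as pd

# Hypothetical frame: one numeric column and one text column.
df = pd.DataFrame({'sid': [1.0, 4.0, 5.0], 'state': ['AL', 'AR', 'AZ']})

print(type(df))              # the class of the object
print(df.dtypes)             # dtype of every column
print(df.columns.tolist())   # column labels
print(df.index)              # the row index

# A column dtype alone may not say much, so look at the Python type
# of the individual elements to see what is really inside.
element_types = df['state'].map(type).unique().tolist()
print(element_types)

# dir() lists the attributes and methods available on any object.
public_names = [name for name in dir(df) if not name.startswith('_')]
print(public_names[:10])
```

The same built-ins (`type`, `dir`, and also `help`) work on any Python object, not only on pandas ones.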
Regards.
David
On Saturday, 14 May 2016, 4:30, Michael Selik <[email protected]>
wrote:
What were you hoping to get from ``df[0]``? When you say it "yields nothing" do
you mean it raised an error? What was the error message?
Have you tried a Google search for "pandas set index"?
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html
On Fri, May 13, 2016 at 11:18 PM David Shi <[email protected]> wrote:
Hello, Michael,
I tried to discover the problem.
df[0] yields nothing
df[1] yields nothing
df[2] yields nothing
However, df[3] gives the following:
sid
-9223372036854775808 NaN
1 133738.70
4 295256.11
5 137733.09
6 409413.58
8 269600.97
9 12852.94
Can we split this back to normal, or turn it into a dictionary, so that I can
put the values back properly?
I would like to use sid as the index in some way.
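A minimal sketch of both options, assuming the data looks roughly like the df[3] output above. The Series and frame here are hypothetical stand-ins, and the column name `value` is invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical Series shaped like the df[3] output:
# values indexed by sid, with one missing entry.
s = pd.Series([np.nan, 133738.70, 295256.11],
              index=pd.Index([0, 1, 4], name='sid'))

# Drop the missing values, then turn the Series into a plain dictionary
# that maps each sid to its value.
lookup = s.dropna().to_dict()
print(lookup)

# If sid is an ordinary column instead, set_index makes it the index.
df = pd.DataFrame({'sid': [1, 4], 'value': [133738.70, 295256.11]})
df = df.set_index('sid')
print(df)
```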
Regards.
David
On Friday, 13 May 2016, 22:58, Michael Selik <[email protected]>
wrote:
What code have you tried? What error message are you receiving?
On Fri, May 13, 2016, 5:54 PM David Shi <[email protected]> wrote:
Hello, Michael,
How do I convert a float type column into an integer, label or string type?
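One possible approach is `astype`, sketched here on a made-up frame. The new column names `fips_int` and `fips_str` are invented for the example.

```python
import pandas as pd

# Hypothetical data: a float column holding whole-number values.
df = pd.DataFrame({'StateFIPS': [1.0, 4.0, 5.0], 'value': [10, 20, 30]})

# Float to integer. Missing values (NaN) have no integer representation,
# so drop or fill them before converting.
df['fips_int'] = df['StateFIPS'].astype(int)

# Float to string by way of integer, so that 1.0 becomes '1' rather than '1.0'.
df['fips_str'] = df['StateFIPS'].astype(int).astype(str)

print(df.dtypes)
```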
On Friday, 13 May 2016, 22:02, Michael Selik <[email protected]>
wrote:
To make clear that you're specifying the index as a label, use df.loc; df.iloc
selects by integer position instead.
>>> import pandas as pd
>>> df = pd.DataFrame({'X': range(4)}, index=list('abcd'))
>>> df
   X
a  0
b  1
c  2
d  3
>>> df.loc['a']
X    0
Name: a, dtype: int64
>>> df.iloc[0]
X    0
Name: a, dtype: int64
On Fri, May 13, 2016 at 4:54 PM David Shi <[email protected]> wrote:
Dear Michael,
To avoid complications, I now group by only one column.
It is OK now. But how do I refer to the new row index? How do I use a floating-point index?
Float64Index([ 1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 16.0,
17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0,
28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0,
39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0,
51.0, 53.0, 54.0, 55.0, 56.0],
dtype='float64', name=u'StateFIPS')
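A small sketch of working with an index like this one, assuming a frame indexed by float StateFIPS values. The frame and its `total` column are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a float index like the one shown above.
df = pd.DataFrame({'total': [10, 20, 30]},
                  index=pd.Index([1.0, 4.0, 5.0], name='StateFIPS'))

# Label-based lookup with .loc works with float labels.
row = df.loc[4.0]
print(row)

# Convert the index to integers if the float labels are inconvenient.
df.index = df.index.astype(int)
row_again = df.loc[4]

# Or move the index back into an ordinary column.
flat = df.reset_index()
print(flat)
```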
Regards.
David
On Friday, 13 May 2016, 21:43, Michael Selik <[email protected]>
wrote:
Here's an example.
>>> import pandas as pd
>>> df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)}, index=list('wxyz'))
>>> df
   data group
w     0     A
x     1     B
y     2     A
z     3     B
>>> df = df.reset_index()
>>> df
  index  data group
0     w     0     A
1     x     1     B
2     y     2     A
3     z     3     B
>>> df.groupby('group').max()
      index  data
group
A         y     2
B         z     3
If that doesn't help, you'll need to explain what you're trying to accomplish
in detail -- what variables you started with, what transformations you want to
do, and what variables you hope to have when finished.
On Fri, May 13, 2016 at 4:36 PM David Shi <[email protected]> wrote:
Hello, Michael,
I changed the groupby to use one column.
The index is different now.
Index([ u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC',
u'DE', u'FL', u'GA', u'IA', u'ID', u'IL', u'IN',
u'KS', u'KY', u'LA', u'MA', u'MD', u'ME', u'MI',
u'MN', u'MO', u'MS', u'MT', u'NC', u'ND', u'NE',
u'NH', u'NJ', u'NM', u'NV', u'NY', u'OH', u'OK',
u'OR', u'PA', u'RI', u'SC', u'SD', u'State', u'TN',
u'TX', u'UT', u'VA', u'VT', u'WA', u'WI', u'WV',
u'WY'],
dtype='object', name=0)
How do I use this index?
Regards.
David
On Friday, 13 May 2016, 21:19, David Shi <[email protected]> wrote:
Hello, Michael,
I typed in df.index
I got the following:
MultiIndex(levels=[[1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0,
11.0, 12.0, 13.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0,
26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0,
39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 53.0,
54.0, 55.0, 56.0], [u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC', u'DE',
u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY', u'LA', u'MA', u'MD',
u'ME', u'MI', u'MN', u'MO', u'MS', u'MT', u'NC', u'ND', u'NE', u'NH', u'NJ',
u'NM', u'NV', u'NY', u'OH', u'OK', u'OR', u'PA', u'RI', u'SC', u'SD', u'State',
u'TN', u'TX', u'UT', u'VA', u'VT', u'WA', u'WI', u'WV', u'WY']],
labels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], [0, 2, 1, 3, 4, 5, 7, 6, 8, 9,
11, 12, 13, 10, 14, 15, 16, 19, 18, 17, 20, 21, 23, 22, 24, 27, 31, 28, 29, 30,
32, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 45, 44, 46, 48, 47, 49]],
names=[u'StateFIPS', 0])
Regards.
David
On Friday, 13 May 2016, 21:11, David Shi <[email protected]> wrote:
Dear Michael,
I have done a number of operations in between.
Providing that information would not help you.
What interests me is how to reset the index after grouping and various other operations.
What command can I type to inspect the current dataframe?
Regards.
David
On Friday, 13 May 2016, 20:58, Michael Selik <[email protected]>
wrote:
Just in case I misunderstood, why don't you make a little example of before
and after the grouping? This mailing list does not accept attachments, so
you'll have to make do with pasting a few rows of comma-separated or
tab-separated values.
On Fri, May 13, 2016 at 3:56 PM Michael Selik <[email protected]> wrote:
In order to preserve your index after the aggregation, you need to make sure it
is considered a data column (via reset_index) and then choose how your
aggregation will operate on that column.
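For instance, a sketch under the assumption that keeping the first original label per group is acceptable. The frame is made up, and `'first'` and `'sum'` are only one possible choice of aggregations.

```python
import pandas as pd

# Hypothetical frame with a meaningful index.
df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)},
                  index=list('wxyz'))

# Turn the index into a regular column (named 'index' by default).
df = df.reset_index()

# Pick one aggregation per column: keep the first original index label
# in each group and sum the data.
result = df.groupby('group').agg({'index': 'first', 'data': 'sum'})
print(result)
```

Passing a dict to `agg` lets each column, including the former index, get its own rule.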
On Fri, May 13, 2016 at 3:29 PM David Shi <[email protected]> wrote:
Hello, Michael,
Why reset_index before grouping?
Regards.
David
On Friday, 13 May 2016, 17:57, Michael Selik <[email protected]> wrote:
On Fri, May 13, 2016 at 12:27 PM David Shi via Python-list
<[email protected]> wrote:
I lost my indexes after grouping in Pandas.
I managed to reset_index and got back the index column.
But how can I get back an index row?
Was the grouping an aggregation? If so, the original indexes are meaningless.
What you could do is reset_index before the grouping and when you aggregate
decide how to handle the formerly-known-as-index column (min, max, mean, ?).
--
https://mail.python.org/mailman/listinfo/python-list