Re: ValueError: arrays must all be same length

2020-10-05 Thread Tim Williams
On Mon, Oct 5, 2020 at 6:47 AM Shaozhong SHI  wrote:

>
> Hi, I managed to flatten it with json_normalize first.
>
> from pandas.io.json import json_normalize
> atable = json_normalize(d)
> atable
>
> Then, I got this table.
>
> brandId brandName careHome constituency
> currentRatings.overall.keyQuestionRatings currentRatings.overall.rating
> currentRatings.overall.reportDate currentRatings.overall.reportLinkId
> currentRatings.reportDate dormancy ... providerId region registrationDate
> registrationStatus regulatedActivities relationships reports specialisms
> type uprn
> 0 BD510 BRAND MACC Care Y Birmingham, Northfield [{u'reportDate':
> u'2020-10-01', u'rating': u'R... Requires improvement 2020-10-01
> 1157c975-c2f1-423e-a2b4-66901779e014 2020-10-01 N ... 1-101641521 West
> Midlands 2013-12-16 Registered [{u'code': u'RA2', u'name':
> u'Accommodation
>
> Then, I tried to expand the column
> of currentRatings.overall.keyQuestionRatings, with
>
> mydf = pd.DataFrame.from_dict(atable['currentRatings.overall.keyQuestionRatings'][0])
> mydf
>
> Then, I got another table.
>
>    name        rating                reportDate  reportLinkId
> 0  Safe        Requires improvement  2020-10-01  1157c975-c2f1-423e-a2b4-66901779e014
> 1  Well-led    Requires improvement  2020-10-01  1157c975-c2f1-423e-a2b4-66901779e014
> 2  Caring      Good                  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43
> 3  Responsive  Good                  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43
> 4  Effective   Requires improvement  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43
>
>
> How can I rearrange this to get a flattened table?
>
> Apparently, the nested data is another table.
>
> Regards,
>
> Shao
>
>
I'm fairly new to pandas myself, so I can't help there. You may want to post
this on Stack Overflow, or look for a similar issue on GitHub.

https://stackoverflow.com/questions/tagged/pandas+json
https://github.com/pandas-dev/pandas/issues




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: ValueError: arrays must all be same length

2020-10-05 Thread Shaozhong SHI
Hi, I managed to flatten it with json_normalize first.

from pandas.io.json import json_normalize
atable = json_normalize(d)
atable

Then, I got this table.

brandId brandName careHome constituency
currentRatings.overall.keyQuestionRatings currentRatings.overall.rating
currentRatings.overall.reportDate currentRatings.overall.reportLinkId
currentRatings.reportDate dormancy ... providerId region registrationDate
registrationStatus regulatedActivities relationships reports specialisms
type uprn
0 BD510 BRAND MACC Care Y Birmingham, Northfield [{u'reportDate':
u'2020-10-01', u'rating': u'R... Requires improvement 2020-10-01
1157c975-c2f1-423e-a2b4-66901779e014 2020-10-01 N ... 1-101641521 West
Midlands 2013-12-16 Registered [{u'code': u'RA2', u'name': u'Accommodation

Then, I tried to expand the column
of currentRatings.overall.keyQuestionRatings, with

mydf = pd.DataFrame.from_dict(atable['currentRatings.overall.keyQuestionRatings'][0])
mydf

Then, I got another table.

   name        rating                reportDate  reportLinkId
0  Safe        Requires improvement  2020-10-01  1157c975-c2f1-423e-a2b4-66901779e014
1  Well-led    Requires improvement  2020-10-01  1157c975-c2f1-423e-a2b4-66901779e014
2  Caring      Good                  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43
3  Responsive  Good                  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43
4  Effective   Requires improvement  2019-10-04  63ff05ec-4d31-406e-83de-49a271cfdc43


How can I rearrange this to get a flattened table?

Apparently, the nested data is another table.

Regards,

Shao
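
One possible way to get a flat table from the nested ratings is sketched
below. It is only an illustration under assumptions: the dictionary `d` is a
trimmed-down stand-in for the API response (values copied from this thread),
and carrying `locationId` onto every row is a choice made here, not something
the thread prescribes.

```python
import pandas as pd

# Trimmed-down stand-in for the API response (assumption: same key layout).
d = {
    "locationId": "1-1004508435",
    "currentRatings": {
        "overall": {
            "rating": "Requires improvement",
            "keyQuestionRatings": [
                {"name": "Safe", "rating": "Requires improvement",
                 "reportDate": "2020-10-01"},
                {"name": "Caring", "rating": "Good",
                 "reportDate": "2019-10-04"},
            ],
        }
    },
}

# A list of dicts is already row-shaped: each dict becomes one row.
ratings = pd.DataFrame(d["currentRatings"]["overall"]["keyQuestionRatings"])

# Repeat the parent key on every row so the child table can be joined back
# to the one-row table of top-level fields later.
ratings.insert(0, "locationId", d["locationId"])

print(ratings)
```

Keeping the ratings as their own table, linked by `locationId`, avoids
repeating every top-level column once per rating.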





Re: ValueError: arrays must all be same length

2020-10-04 Thread Tim Williams
On Sun, Oct 4, 2020 at 8:39 AM Tim Williams  wrote:

>
>
> On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI 
> wrote:
>
>> Hello,
>>
>> I got a json response from an API and tried to use pandas to put data into
>> a dataframe.
>>
>> However, I kept getting this ValueError: arrays must all be same length.
>>
>> Can anyone help?
>>
>> The following is the json text.  Regards, Shao
>>
>> (snip json_text)
>
>
>> import pandas as pd
>>
>> import json
>>
>> j = json.JSONDecoder().decode(req.text)  ###req.json
>>
>> df = pd.DataFrame.from_dict(j)
>>
>
> I copied json_text into a Jupyter notebook and got the same error trying
> to convert it into a pandas DataFrame. When I tried to paste it in as a
> string, I got an error, but without enclosing the paste in quotes, I got
> the dictionary.
>
>
(delete long response output)


> for k in json_text.keys():
>     if isinstance(json_text[k], list):
>         print(k, len(json_text[k]))
>
> relationships 0
> locationTypes 0
> regulatedActivities 2
> gacServiceTypes 1
> inspectionCategories 1
> specialisms 4
> inspectionAreas 0
> historicRatings 4
> reports 5
>
> HTH.
>
>
This may also be more of a pandas issue.

json.loads(json.dumps(json_text))

has a successful round-trip




Re: ValueError: arrays must all be same length

2020-10-04 Thread Tim Williams
On Fri, Oct 2, 2020 at 11:00 AM Shaozhong SHI 
wrote:

> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.  Regards, Shao
>
> (snip json_text)


> import pandas as pd
>
> import json
>
> j = json.JSONDecoder().decode(req.text)  ###req.json
>
> df = pd.DataFrame.from_dict(j)
>

I copied json_text into a Jupyter notebook and got the same error trying to
convert it into a pandas DataFrame. When I tried to paste it in as a
string, I got an error, but without enclosing the paste in quotes, I got
the dictionary.

dir(json_text)
['__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

pd.DataFrame(json_text)

---

ValueError                                Traceback (most recent call last)
 in 
----> 1 pd.DataFrame(json_text)

D:\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data,
index, columns, dtype, copy)
433 )
434 elif isinstance(data, dict):
--> 435 mgr = init_dict(data, index, columns, dtype=dtype)
436 elif isinstance(data, ma.MaskedArray):
437 import numpy.ma.mrecords as mrecords

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
init_dict(data, index, columns, dtype)
252 arr if not is_datetime64tz_dtype(arr) else arr.copy()
for arr in arrays
253 ]
--> 254 return arrays_to_mgr(arrays, data_names, index, columns,
dtype=dtype)
255
256

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
arrays_to_mgr(arrays, arr_names, index, columns, dtype)
 62 # figure out the index, if necessary
 63 if index is None:
---> 64 index = extract_index(arrays)
 65 else:
 66 index = ensure_index(index)

D:\anaconda3\lib\site-packages\pandas\core\internals\construction.py in
extract_index(data)
363 lengths = list(set(raw_lengths))
364         if len(lengths) > 1:
--> 365 raise ValueError("arrays must all be same length")
366
367 if have_dicts:

ValueError: arrays must all be same length


I got a different error trying json.loads(str(json_text)):
---
JSONDecodeError   Traceback (most recent call last)
 in 
----> 1 json.loads(str(json_text))

D:\anaconda3\lib\json\__init__.py in loads(s, cls, object_hook,
parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
355 parse_int is None and parse_float is None and
356 parse_constant is None and object_pairs_hook is None
and not kw):
--> 357 return _default_decoder.decode(s)
358 if cls is None:
359 cls = JSONDecoder

D:\anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):

D:\anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
351 """
352 try:
--> 353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
355 raise JSONDecodeError("Expecting value", s, err.value)
from None

JSONDecodeError: Expecting property name enclosed in double quotes: line 1
column 2 (char 1)
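
The JSONDecodeError above is probably unrelated to the pandas problem: str()
on a dict produces Python's repr, which uses single quotes, while JSON
requires double quotes. A minimal sketch of the difference follows, using a
small made-up `record` rather than the full response:

```python
import json

# Small stand-in for the decoded response (assumption: any dict behaves alike).
record = {"locationId": "1-1004508435", "relationships": []}

as_repr = str(record)         # Python repr: keys and strings in single quotes
as_json = json.dumps(record)  # JSON text: keys and strings in double quotes

try:
    json.loads(as_repr)
    repr_parsed = True
except json.JSONDecodeError:
    # Single-quoted keys are not valid JSON, so the parser rejects the repr.
    repr_parsed = False

# json.dumps followed by json.loads returns an equal dict.
round_trip = json.loads(as_json)

print(repr_parsed, round_trip == record)
```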

I think the solution is to fix the arrays so that the lengths match.

for k in json_text.keys():
    if isinstance(json_text[k], list):
        print(k, len(json_text[k]))

relationships 0
locationTypes 0
regulatedActivities 2
gacServiceTypes 1
inspectionCategories 1
specialisms 4
inspectionAreas 0
historicRatings 4
reports 5
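
Padding the lists to one length is one reading of "fix the arrays"; another
option, sketched here under assumptions, is to treat the whole response as a
single row so the list lengths no longer have to agree. The `record` below is
a cut-down stand-in for `json_text`, and `pd.json_normalize` assumes
pandas 1.0 or later (older versions expose it as
`pandas.io.json.json_normalize`).

```python
import pandas as pd

# Cut-down stand-in for json_text (assumption: same kinds of values).
record = {
    "locationId": "1-1004508435",
    "lastInspection": {"date": "2020-06-24"},
    "relationships": [],
    "specialisms": [{"name": "Dementia"}, {"name": "Physical disabilities"}],
}

# Same check as the loop above, collected into a dict.
lengths = {k: len(v) for k, v in record.items() if isinstance(v, list)}
print(lengths)

# One record -> one row. Nested dicts become dotted column names;
# each list stays as a single cell holding the whole list.
row = pd.json_normalize([record])
print(row.shape)
```

The list-valued cells can then be expanded into their own tables one at a
time.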

HTH.



Re: ValueError: arrays must all be same length

2020-10-04 Thread Christian Gollwitzer

On 02.10.20 at 14:34, Shaozhong SHI wrote:

> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.

What do you expect the dataframe to look like? DataFrames are 2D tables;
JSON is a tree.
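
To make the tree-versus-table point concrete, here is a rough sketch in plain
Python. The helper `split_tree` is hypothetical (it is not part of pandas or
the json module): it folds nested objects into dotted column names for a
single row and sets every list aside as a separate child table.

```python
def split_tree(node, prefix=""):
    """Return (row, children) for one JSON object.

    Scalars, including those inside nested objects, become dotted columns
    of a single row; every list is set aside under its dotted path.
    """
    row, children = {}, {}
    for key, value in node.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            sub_row, sub_children = split_tree(value, prefix=path + ".")
            row.update(sub_row)
            children.update(sub_children)
        elif isinstance(value, list):
            children[path] = value  # a branch with many items: its own table
        else:
            row[path] = value       # a leaf: one cell in the parent row
    return row, children


# Cut-down stand-in for the response (assumption: same nesting pattern).
record = {
    "locationId": "1-1004508435",
    "lastInspection": {"date": "2020-06-24"},
    "specialisms": [{"name": "Dementia"}, {"name": "Physical disabilities"}],
    "currentRatings": {
        "overall": {
            "rating": "Requires improvement",
            "keyQuestionRatings": [{"name": "Safe"}],
        }
    },
}

row, children = split_tree(record)
print(row)
print(sorted(children))
```

A tree like this maps onto several related 2D tables rather than one.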


Christian



Re: ValueError: arrays must all be same length

2020-10-03 Thread Peter Pearson
On Fri, 2 Oct 2020 13:34:46 +0100, Shaozhong SHI  wrote:
> Hello,
>
> I got a json response from an API and tried to use pandas to put data into
> a dataframe.
>
> However, I kept getting this ValueError: arrays must all be same length.
>
> Can anyone help?
>
> The following is the json text.  Regards, Shao
>
> {
>   "locationId": "1-1004508435",
>   "providerId": "1-101641521",
...
[huge block removed]
...
>   "reportType": "Location"
> }
>   ]
> }
>
> In [ ]:
>
>
> In [25]:
> j
>
>
> import pandas as pd
>
> import json
>
> j = json.JSONDecoder().decode(req.text)  ###req.json
>
> df = pd.DataFrame.from_dict(j)

An important programming skill is paring back failing code to
create the smallest example that exhibits the failure.  Often, the
paring process reveals the problem; and if it doesn't, the shorter
code is more likely to attract help.
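
As an illustration of that paring-back, the failure seems to need only two
keys whose lists differ in length. The dictionary below is a made-up minimal
case, not the original data, and the exact wording of the error message may
vary between pandas versions.

```python
import pandas as pd

# Two list-valued keys of different lengths are enough (hypothetical values).
minimal = {"gacServiceTypes": ["a"], "reports": ["r1", "r2"]}

try:
    pd.DataFrame.from_dict(minimal)
    error = None
except ValueError as exc:
    # Each key is treated as a column, so the lists must be equally long.
    error = exc

print(type(error).__name__, error)

# Wrapping the dict in a list treats it as one record, i.e. one row,
# with each list stored whole in a single cell.
df = pd.DataFrame([minimal])
print(df.shape)
```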

-- 
To email me, substitute nowhere->runbox, invalid->com.


ValueError: arrays must all be same length

2020-10-02 Thread Shaozhong SHI
Hello,

I got a json response from an API and tried to use pandas to put data into
a dataframe.

However, I kept getting this ValueError: arrays must all be same length.

Can anyone help?

The following is the json text.  Regards, Shao

{
  "locationId": "1-1004508435",
  "providerId": "1-101641521",
  "organisationType": "Location",
  "type": "Social Care Org",
  "name": "Meadow Rose Nursing Home",
  "brandId": "BD510",
  "brandName": "BRAND MACC Care",
  "onspdCcgCode": "E38000220",
  "onspdCcgName": "NHS Birmingham and Solihull CCG",
  "odsCode": "VM4G9",
  "uprn": "100070537642",
  "registrationStatus": "Registered",
  "registrationDate": "2013-12-16",
  "dormancy": "N",
  "numberOfBeds": 56,
  "postalAddressLine1": "96 The Roundabout",
  "postalAddressTownCity": "Birmingham",
  "postalAddressCounty": "West Midlands",
  "region": "West Midlands",
  "postalCode": "B31 2TX",
  "onspdLatitude": 52.399843,
  "onspdLongitude": -1.989241,
  "careHome": "Y",
  "inspectionDirectorate": "Adult social care",
  "mainPhoneNumber": "01214769808",
  "constituency": "Birmingham, Northfield",
  "localAuthority": "Birmingham",
  "lastInspection": {
    "date": "2020-06-24"
  },
  "lastReport": {
    "publicationDate": "2020-10-01"
  },
  "relationships": [

  ],
  "locationTypes": [

  ],
  "regulatedActivities": [
    {
      "name": "Accommodation for persons who require nursing or personal care",
      "code": "RA2",
      "contacts": [
        {
          "personTitle": "Mr",
          "personGivenName": "Steven",
          "personFamilyName": "Kazembe",
          "personRoles": [
            "Registered Manager"
          ]
        }
      ]
    },
    {
      "name": "Treatment of disease, disorder or injury",
      "code": "RA5",
      "contacts": [
        {
          "personTitle": "Mr",
          "personGivenName": "Steven",
          "personFamilyName": "Kazembe",
          "personRoles": [
            "Registered Manager"
          ]
        }
      ]
    }
  ],
  "gacServiceTypes": [
    {
      "name": "Nursing homes",
      "description": "Care home service with nursing"
    }
  ],
  "inspectionCategories": [
    {
      "code": "S1",
      "primary": "true",
      "name": "Residential social care"
    }
  ],
  "specialisms": [
    {
      "name": "Caring for adults over 65 yrs"
    },
    {
      "name": "Caring for adults under 65 yrs"
    },
    {
      "name": "Dementia"
    },
    {
      "name": "Physical disabilities"
    }
  ],
  "inspectionAreas": [

  ],
  "currentRatings": {
    "overall": {
      "rating": "Requires improvement",
      "reportDate": "2020-10-01",
      "reportLinkId": "1157c975-c2f1-423e-a2b4-66901779e014",
      "useOfResources": {

      },
      "keyQuestionRatings": [
        {
          "name": "Safe",
          "rating": "Requires improvement",
          "reportDate": "2020-10-01",
          "reportLinkId": "1157c975-c2f1-423e-a2b4-66901779e014"
        },
        {
          "name": "Well-led",
          "rating": "Requires improvement",
          "reportDate": "2020-10-01",
          "reportLinkId": "1157c975-c2f1-423e-a2b4-66901779e014"
        },
        {
          "name": "Caring",
          "rating": "Good",
          "reportDate": "2019-10-04",
          "reportLinkId": "63ff05ec-4d31-406e-83de-49a271cfdc43"
        },
        {
          "name": "Responsive",
          "rating": "Good",
          "reportDate": "2019-10-04",
          "reportLinkId": "63ff05ec-4d31-406e-83de-49a271cfdc43"