Re: How to expand and flatten a nested of list of dictionaries of varied lengths?

dn via Python-list Sun, 18 Oct 2020 16:55:02 -0700

If I may, a couple of items of list-etiquette (polite behavior), as I understand them:

1 please reply to the list (cf only myself) because @Mats (who responded earlier) and others on this list are much smarter than me, and might be able to help you more quickly

2 top-posting seems to take the form 'answer, then question' which is illogical to everyone, except apparently Microsoft. It is better to have the conversation 'develop' as it proceeds - all the early information at the beginning, and the more detailed towards the 'end'. That is not to say that we can't "snip" or 'do some gardening', to remove unnecessary or erroneous material, as the conversation progresses. You will notice (as below) that this also enables a posting with multiple questions, to be discussed point-by-point.


Now to work...


> On Sun, 18 Oct 2020 at 21:48, dn via Python-list <python-list@python.org
> <mailto:python-list@python.org>> wrote:
>
>     On 19/10/2020 09:09, Shaozhong SHI wrote:
>      > Even worse is that, in some cases, an addition called
>     serviceRatings as a
>      > key occur with new data unexpectedly.
>
>     "Even worse" than what?
>
>     Do you need to keep a list of acceptable/applicable/available keys?
>     (and reject or deal with others in some alternate fashion)
>
>

> > How to produce a robust Python/Pandas script to cope with all these?

...

[I often use ellipsis to indicate that I have snipped 'stuff in the middle', others are more overt and will write "<snip>" or similar]

>     You may find it helpful to use the pprint ("pretty printing") library to
>     print data-structures in a more readable/structured format.
>
>     To "flatten" a dictionary, you must first be sure that there will be no
>     keys that will clash (else the second entry will completely replace the
>     first, without trace).
>
>     Thus, we will need to understand more about this particular definition
>     of "flatten" in relation to the range of incoming data. Perhaps explain
>     them in English first...
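
To illustrate the key-clash warning in the quoted text, here is a minimal sketch, assuming Python 3. The record and both function names (flatten_naive, flatten_prefixed) are made up for the illustration and do not come from the data in this thread. The first function merges nested keys directly and silently loses a value; the second prefixes each child key with its parent's key, which is only one of several ways to avoid the clash.

```python
# Hypothetical record: 'rating' appears at two levels of nesting.
record = {"rating": "Good", "overall": {"rating": "Requires improvement"}}


def flatten_naive(record):
    """Merge nested dicts into one level; clashing keys overwrite each other."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            flat.update(flatten_naive(value))
        else:
            flat[key] = value
    return flat


def flatten_prefixed(record, prefix=""):
    """Merge nested dicts into one level, prefixing child keys with the parent key."""
    flat = {}
    for key, value in record.items():
        full_key = prefix + key
        if isinstance(value, dict):
            flat.update(flatten_prefixed(value, full_key + "."))
        else:
            flat[full_key] = value
    return flat


print(flatten_naive(record))     # the top-level 'rating' has been replaced
print(flatten_prefixed(record))  # both ratings survive under distinct keys
```

Whether prefixed keys are acceptable depends on what "flatten" is supposed to mean for the analysis, which is the open question here.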

On 19/10/2020 12:14, Shaozhong SHI wrote:

Hi, DN,

This is the result of pprint.


[{u'overall': {u'keyQuestionRatings': [{u'name': u'Safe',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Well-led',
                                        u'rating': u'Requires improvement'}],
               u'rating': u'Requires improvement'},
  u'reportDate': u'2019-10-04',
  u'reportLinkId': u'63ff05ec-4d31-406e-83de-49a271cfdc43'},
 {u'overall': {u'keyQuestionRatings': [{u'name': u'Safe',
                                        u'rating': u'Good'},
                                       {u'name': u'Well-led',
                                        u'rating': u'Good'},
                                       {u'name': u'Caring',
                                        u'rating': u'Good'},
                                       {u'name': u'Responsive',
                                        u'rating': u'Good'},
                                       {u'name': u'Effective',
                                        u'rating': u'Requires improvement'}],
               u'rating': u'Good'},
  u'reportDate': u'2017-09-08',
  u'reportLinkId': u'4f20da40-89a4-4c45-a7f9-bfd52b48f286'},
 {u'overall': {u'keyQuestionRatings': [{u'name': u'Safe',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Well-led',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Caring',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Responsive',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Effective',
                                        u'rating': u'Good'}],
               u'rating': u'Requires improvement'},
  u'reportDate': u'2016-06-11',
  u'reportLinkId': u'0cc4226b-401e-4f0f-ba35-062cbadffa8f'},
 {u'overall': {u'keyQuestionRatings': [{u'name': u'Safe',
                                        u'rating': u'Good'},
                                       {u'name': u'Well-led',
                                        u'rating': u'Good'},
                                       {u'name': u'Caring',
                                        u'rating': u'Good'},
                                       {u'name': u'Responsive',
                                        u'rating': u'Requires improvement'},
                                       {u'name': u'Effective',
                                        u'rating': u'Good'}],
               u'rating': u'Good'},
  u'reportDate': u'2015-01-12',
  u'reportLinkId': u'a11c1e52-ddfd-4cd8-8b56-1b96ac287c96'}]

Well done! This looks so much better, and more to the point, it is easier for 'us' to see the structure - but oh dear, doesn't email wrapping make our lives difficult!
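
With the structure visible, one possible reading of "expand and flatten" is one output row per key-question rating, with the report-level fields repeated on every row. The sketch below assumes Python 3 and uses the first two reports from the pprint output above; the function name expand and the column names overallRating, question and questionRating are invented for illustration. It is not a statement of what the analysis actually needs.

```python
# First two reports from the pprint output, written as Python 3 literals.
reports = [
    {
        "overall": {
            "keyQuestionRatings": [
                {"name": "Safe", "rating": "Requires improvement"},
                {"name": "Well-led", "rating": "Requires improvement"},
            ],
            "rating": "Requires improvement",
        },
        "reportDate": "2019-10-04",
        "reportLinkId": "63ff05ec-4d31-406e-83de-49a271cfdc43",
    },
    {
        "overall": {
            "keyQuestionRatings": [
                {"name": "Safe", "rating": "Good"},
                {"name": "Well-led", "rating": "Good"},
                {"name": "Caring", "rating": "Good"},
                {"name": "Responsive", "rating": "Good"},
                {"name": "Effective", "rating": "Requires improvement"},
            ],
            "rating": "Good",
        },
        "reportDate": "2017-09-08",
        "reportLinkId": "4f20da40-89a4-4c45-a7f9-bfd52b48f286",
    },
]


def expand(reports):
    """Return one flat dict per key-question rating, however many a report has."""
    rows = []
    for report in reports:
        overall = report["overall"]
        for question in overall["keyQuestionRatings"]:
            rows.append({
                "reportDate": report["reportDate"],
                "reportLinkId": report["reportLinkId"],
                # Renamed so it cannot clash with the per-question rating.
                "overallRating": overall["rating"],
                "question": question["name"],
                "questionRating": question["rating"],
            })
    return rows


rows = expand(reports)
for row in rows:
    print(row)
```

Because every row has the same keys, the varied lengths of keyQuestionRatings stop being a problem, and a list of uniform dicts like this is a shape that tabular tools generally accept.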

Normally, it is like this.
But sometimes, serviceRatings is added to the key list - [u'overall', u'reportDate', u'reportLinkId']
That is what I meant about a dynamically growing tree.

OK, (and only you/your user can answer this question) why do all the examples (above) not have a service-rating?

I am wondering if the use of the word "unexpectedly" has translated accurately between languages - if a data-item is part of the data-input, then our code must be able to handle it or "clean" it, as specified (by the user).


- are you able to add a service-rating to each "overall" entry?

- where service-ratings are not currently-available, would it be acceptable to add the field with a value of None? (or some other "sentinel-value")

- if the analysis-phase does not consider service-ratings, can we write code to read the field from the data-source, but discard it whilst loading everything else into a Pandas matrix?

How best to handle this?

This requires understanding how the service-rating value will be used in the analysis, and thus how relevant records may be selected/ignored. Just because it features in the data, doesn't mean it needs to be included in the analysis!
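
As a rough sketch of the options raised above (a None sentinel, or discarding the field), assuming Python 3: the names REQUIRED_KEYS, OPTIONAL_KEYS and normalise are hypothetical, and the shape of a real serviceRatings value has not been shown in this thread, so a placeholder string stands in for it.

```python
REQUIRED_KEYS = ("overall", "reportDate", "reportLinkId")
OPTIONAL_KEYS = ("serviceRatings",)


def normalise(record, keep_optional=True):
    """Return a record with a predictable set of keys.

    Keys outside the two lists are rejected rather than silently passed on.
    Optional keys are filled with None when absent, or dropped entirely
    when keep_optional is False.
    """
    unexpected = set(record) - set(REQUIRED_KEYS) - set(OPTIONAL_KEYS)
    if unexpected:
        raise ValueError(f"unexpected keys: {sorted(unexpected)}")
    clean = {key: record[key] for key in REQUIRED_KEYS}
    if keep_optional:
        for key in OPTIONAL_KEYS:
            clean[key] = record.get(key)  # None acts as the sentinel
    return clean


plain = {
    "overall": {"rating": "Good"},
    "reportDate": "2017-09-08",
    "reportLinkId": "4f20da40-89a4-4c45-a7f9-bfd52b48f286",
}
# Placeholder value only: the real structure of serviceRatings is unknown here.
extra = dict(plain, serviceRatings="placeholder")

print(normalise(plain))                       # serviceRatings present, set to None
print(normalise(extra))                       # serviceRatings kept as supplied
print(normalise(extra, keep_optional=False))  # serviceRatings discarded
```

Which of the two behaviours is right depends on the answer to the question above: whether the analysis uses service-ratings at all.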



Have I understood the question?
--
Regards =dn
--
https://mail.python.org/mailman/listinfo/python-list