Re: How to copy paragraphs (with number formatting) and images from Words (.docx) and paste into Excel (.xlsx) using Python

2020-03-23 Thread Beverly Pope
On Mar 22, 2020, at 11:12 PM, A S  wrote:
> 
> On Monday, 23 March 2020 01:58:38 UTC+8, Beverly Pope  wrote:
>>> On Mar 22, 2020, at 9:47 AM, A S >> > wrote:
>>> 
>>> I can't seem to paste pictures into this discussion so please see both my 
>>> current and desired Excel output here:
>>> 
>>> https://stackoverflow.com/questions/60800494/how-to-copy-paragraphs-with-number-formatting-and-images-from-words-docx-an
>>>  
>>> >>  
>>> >
>> Did you try using the 2 part answer on the stackoverflow webpage?
>> 
>> Bev in TX
> 
> I'm able to get the paragraphs copied correctly now! But i'm trying to figure 
> out if there's a way to copy and paste the images into the Excel, along with 
> the paragraphs as well. Do you have an idea? :)

I’m glad to hear that solution worked for you.  With that said, I only went to 
stackoverflow out of curiosity and happened ro see the posted solution. I 
probably know less about using Python to copy data from Word to Excel than you 
do, given that yesterday was the first time that I had heard about it.  

I did read that MS docx files are zip files, which can be unzipped in Python.
https://gist.github.com/another-junior-dev/990a4e622868627cb93be3d8fa2eff04 

That could provide access to the pictures contained in the document, but it 
doesn’t explain how to determine where you want to place the pictures with 
relation to text in your Excel spreadsheet.  If you could determine that, then 
you could use XlsxWriter module’s worksheet.insert_image() to insert the image. 
 See n”the Worksheet Class” in the XlsxWriter docs:
https://xlsxwriter.readthedocs.io 
The rest is beyond the realm of my knowledge.

If it were me, I would go back to stackoveflow and open a new question, as this 
is different than your original post.

Bev in TX
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to copy paragraphs (with number formatting) and images from Words (.docx) and paste into Excel (.xlsx) using Python

2020-03-22 Thread A S
On Monday, 23 March 2020 01:58:38 UTC+8, Beverly Pope  wrote:
> > On Mar 22, 2020, at 9:47 AM, A S  wrote:
> > 
> > I can't seem to paste pictures into this discussion so please see both my 
> > current and desired Excel output here:
> > 
> > https://stackoverflow.com/questions/60800494/how-to-copy-paragraphs-with-number-formatting-and-images-from-words-docx-an
> >  
> > 
> Did you try using the 2 part answer on the stackoverflow webpage?
> 
> Bev in TX

I'm able to get the paragraphs copied correctly now! But i'm trying to figure 
out if there's a way to copy and paste the images into the Excel, along with 
the paragraphs as well. Do you have an idea? :)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to copy paragraphs (with number formatting) and images from Words (.docx) and paste into Excel (.xlsx) using Python

2020-03-22 Thread Beverly Pope



> On Mar 22, 2020, at 9:47 AM, A S  wrote:
> 
> I can't seem to paste pictures into this discussion so please see both my 
> current and desired Excel output here:
> 
> https://stackoverflow.com/questions/60800494/how-to-copy-paragraphs-with-number-formatting-and-images-from-words-docx-an
>  
> 
Did you try using the 2 part answer on the stackoverflow webpage?

Bev in TX
-- 
https://mail.python.org/mailman/listinfo/python-list


How to copy paragraphs (with number formatting) and images from Words (.docx) and paste into Excel (.xlsx) using Python

2020-03-22 Thread A S
I have contract clauses in Words (.docx) format that needs to be frequently 
copy and pasted into Excel (.xlsx) to be sent to the third party. The clauses 
are often updated hence there's always a need to copy and paste these clauses 
over. I only need to copy and paste all the paragraphs and images after the 
contents page. Here is a sample of the Clause document 
(https://drive.google.com/open?id=1ZzV29R6y2q0oU3HAVrqsFa158OhvpxEK).

I have tried doing up a code using Python to achieve this outcome. Here is the 
code that I have done so far:

!pip install python-docx
import docx
import xlsxwriter

document = docx.Document("Clauses Sample.docx")
wb = xlsxwriter.Workbook('C://xx//clauses sample.xlsx')

docText = []
index_row = 0
Sheet1 = wb.add_worksheet("Shee")

for paragraph in document.paragraphs:
if paragraph.text:
docText.append(paragraph.text)
xx = '\n'.join(docText)

Sheet1.write(index_row,0, xx)

index_row = index_row+1

wb.close()
#print(xx) 
However, my Excel file output looks like this:

I can't seem to paste pictures into this discussion so please see both my 
current and desired Excel output here:

https://stackoverflow.com/questions/60800494/how-to-copy-paragraphs-with-number-formatting-and-images-from-words-docx-an
-- 
https://mail.python.org/mailman/listinfo/python-list