
Problem with concatenating two dataframes

2021-11-06 Thread Mahmood Naderan via Python-list

In the following code, I am trying to create some key-value pairs in a 
dictionary where the first element is a name and the second element is a 
dataframe.

# Creating a dictionary
data = {'Value':[0,0,0]}
kernel_df = pd.DataFrame(data, index=['M1','M2','M3'])
dict = {'dummy':kernel_df}
# dummy  ->  Value
#   M1  0
#   M2  0
#   M3  0


Then I read a file, split it into batches, and compare the name in each batch 
with the names stored in the dictionary. If the name doesn't exist, a new 
key-value pair (name and dataframe) is created. Otherwise, the Value column is 
appended to the existing dataframe.


df = pd.read_csv('test.batch.csv')
print(df)
for i in range(0, len(df), 3):
    print("\n--BATCH BEGIN")
    batch_df = df.iloc[i:i+3]
    name = batch_df.loc[i].at["Name"]
    values = batch_df.loc[:,["Value"]]
    print(name)
    print(values)
    print("--BATCH END")
    if name in dict:
        # Append values to the existing key
        dict[name] = pd.concat( dict[name],values )    # <-- ERROR
    else:
        # Create a new pair in dictionary
        dict[name] = values



As you can see in the output below, the concat statement raises an error.



   ID Name Metric  Value
0   0   K1     M1     10
1   0   K1     M2      5
2   0   K1     M3     10
3   1   K2     M1     20
4   1   K2     M2     10
5   1   K2     M3     15
6   2   K1     M1      2
7   2   K1     M2      2
8   2   K1     M3      2

--BATCH BEGIN
K1
   Value
0 10
1  5
2 10
--BATCH END

--BATCH BEGIN
K2
   Value
3 20
4 10
5 15
--BATCH END

--BATCH BEGIN
K1
   Value
6  2
7  2
8  2
--BATCH END




When it reaches the concat() statement, I get this error:

TypeError: first argument must be an iterable of pandas objects, you passed an 
object of type "DataFrame"


Based on the definition I wrote at the beginning of the code, "dict[name]" 
should be a dataframe. Isn't it?

How can I fix that?



Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Problem with concatenating two dataframes

2021-11-06 Thread Mahmood Naderan via Python-list

>Try this instead:
>
>
>    dict[name] = pd.concat([dict[name], values])


OK, that fixed the problem. However, I see that they are concatenated 
vertically. How can I change that to horizontal? The dictionary printed at the 
end looks like


{'dummy': Value
M1  0
M2  0
M3  0, 'K1':    Value
0 10
1  5
2 10
6  2
7  2
8  2, 'K2':    Value
3 20
4 10
5 15}



For K1, there should be three rows and two columns labeled as Value.




Regards,
Mahmood






Re: Problem with concatenating two dataframes

2021-11-06 Thread Mahmood Naderan via Python-list

>The second argument of pd.concat is 'axis', which defaults to 0. Try
>using 1 instead of 0.


Unfortunately, that doesn't help...


dict[name] = pd.concat( [dict[name],values], axis=1 )



{'dummy':     Value
M1      0
M2      0
M3      0, 'K1':    Value  Value
0   10.0    NaN
1    5.0    NaN
2   10.0    NaN
6    NaN    2.0
7    NaN    2.0
8    NaN    2.0, 'K2':    Value
3     20
4     10
5     15}
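
The NaN cells appear because pd.concat with axis=1 aligns rows by index label, 
and the two K1 batches carry different labels (0-2 and 6-8). A minimal sketch of 
one possible fix, assuming every batch has the same number of rows in the same 
metric order, is to reset the index of both frames before concatenating. The 
names `existing` and `new_batch` are placeholders, not part of the original code.

```python
import pandas as pd

# Two batches for the same key, carrying the row labels they had in the CSV.
existing = pd.DataFrame({"Value": [10, 5, 10]}, index=[0, 1, 2])
new_batch = pd.DataFrame({"Value": [2, 2, 2]}, index=[6, 7, 8])

# Drop the original row labels so both frames are labelled 0..2
# and therefore line up row by row when placed side by side.
combined = pd.concat(
    [existing.reset_index(drop=True), new_batch.reset_index(drop=True)],
    axis=1,
)
print(combined)
```

In the loop above this would correspond to resetting the index of both 
dict[name] and values before the concat call.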



Regards,
Mahmood




Returning the index of a row in dataframe

2021-11-14 Thread Mahmood Naderan via Python-list

Hi

In the following dataframe, I want to get the index label of a row by specifying 
its row number, which here is the same as the entry in the Value column.


               Value
global loads       0
global stores      1
local loads        2


For example, `df.iloc[1].index.name` should return "global stores" but the 
output is `None`. Any idea about that?


Regards,
Mahmood

Re: Returning the index of a row in dataframe

2021-11-14 Thread Mahmood Naderan via Python-list

>>> df.iloc[1].name


Correct. I also see that 'df.index[1]' works fine.
Thanks.
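
For reference, a small sketch of why the first attempt printed None. It rebuilds 
the example frame by hand (the construction is an assumption; only the labels 
and values come from the post). df.iloc[1] is a Series whose index holds the 
column labels, so .index.name is the name of that column index, which is unset 
here; the row label is stored in the Series' own .name attribute.

```python
import pandas as pd

df = pd.DataFrame(
    {"Value": [0, 1, 2]},
    index=["global loads", "global stores", "local loads"],
)

row = df.iloc[1]                 # Series indexed by the column labels
label_from_row = row.name        # label of the selected row
label_from_index = df.index[1]   # same label, read directly from the index
index_name = row.index.name      # name of the column Index, unset here

print(label_from_row, label_from_index, index_name)
```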


Regards,
Mahmood




Using astype(int) for strings with thousand separator

2021-11-14 Thread Mahmood Naderan via Python-list

Hi

While reading a csv file, some cells have values like '1,024', that is, they 
contain the thousands separator ','. Therefore, when I try to process them with 

  row = df.iloc[0].astype(int)

I get the following error

  ValueError: invalid literal for int() with base 10: '1,024'


How can I fix that? Any idea?



Regards,
Mahmood

Re: Using astype(int) for strings with thousand separator

2021-11-15 Thread Mahmood Naderan via Python-list

> (see 
> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

Got it. Thanks.
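
The quoted reply only links to the read_csv documentation, so the exact option 
it had in mind is not shown here. One option that page documents is the 
`thousands` parameter; the sketch below assumes that is what was meant and uses 
an in-memory CSV (the column names A and B are made up) in place of the real 
file.

```python
import io

import pandas as pd

# Quoted cells keep the comma inside one field; thousands="," tells the
# parser to treat that comma as a digit-group separator.
csv_text = 'A,B\n"1,024","2,048"\n7,8\n'
df = pd.read_csv(io.StringIO(csv_text), thousands=",")

row = df.iloc[0].astype(int)
print(row)
```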


Regards,
Mahmood



get_axes not present?

2021-11-18 Thread Mahmood Naderan via Python-list

Hi
I am using the following versions


>>> import matplotlib
>>> print(matplotlib. __version__)
3.3.4
>>> import pandas as pd
>>> print(pd.__version__)
1.2.3
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=8, micro=10, releaselevel='final', serial=0)



In my code, I use axes in Pandas plot() like this (note that I omit some 
variables in this snippet to highlight the problem):



def plot_dataframe(df, cnt, axes):
    plt.subplot(2, 1, 1)
    ax1 = row.plot( fontsize=font_size, linewidth=line_width, 
markersize=marker_size, marker='o', title='Raw values', label=cnt, ax=axes[0] )



def plot_kernels(my_dict2):
    fig,axes = plt.subplots(2,1, figsize=(20, 15))
    should_plot = plot_dataframe(df, cnt, axes=axes)
    for ax in axes:
        ax.legend()
    plt.show()



However, I get this error:


Traceback (most recent call last):
  File "process_csv.py", line 174, in <module>
    plot_kernels( my_dict2 )
  File "process_csv.py", line 62, in plot_kernels
    should_plot = plot_dataframe(df, cnt, axes=axes)
  File "process_csv.py", line 34, in plot_dataframe
    ax1 = row.plot( fontsize=font_size, linewidth=line_width, 
markersize=marker_size, marker='o', title='Raw values', label=cnt, ax=axes[0] )
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_core.py", 
line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py",
 line 61, in plot
    plot_obj.generate()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 283, in generate
    self._adorn_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 483, in _adorn_subplots
    all_axes = self._get_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 903, in _get_subplots
    ax for ax in self.axes[0].get_figure().get_axes() if isinstance(ax, Subplot)
AttributeError: 'NoneType' object has no attribute 'get_axes'




I guess there is a mismatch between versions. Is there any workaround for that?




Regards,
Mahmood

Re: get_axes not present?

2021-11-18 Thread Mahmood Naderan via Python-list

>It's not saying get_axes doesn't exist because of version skew, it's
>saying that the object returned by the call to the left of it
>(get_figure()) returned None, and None doesn't have methods
>
>Something isn't set up right, but you'll have to trace that through.



Do you think the following statement is correct?

ax1 = row.plot( fontsize=font_size, 
    linewidth=line_width, 
    markersize=marker_size, 
    marker='o', 
    title='Raw values', 
    label=cnt, 
    ax=axes[0] )
ax1.set_ylabel( yax_label, fontsize=font_size )


As you can see, I assign the result of plot() to ax1 and then call some methods 
on it, e.g. set_ylabel().

On the other hand, I have specified `label` and `ax` in plot(), too.



Regards,
Mahmood


Re: get_axes not present?

2021-11-19 Thread Mahmood Naderan via Python-list

>And what is the result of plot()?  Is it a valid object, or is it None?

Well the error happens on the plot() line. I tried to print some information 
like this:


    print("axes=", axes)
    print("axes[0]=", axes[0])
    print("cnt=", cnt)
    print("row=", row)
    ax1 = row.plot( fontsize=font_size, linewidth=line_width, 
markersize=marker_size, marker='o', title='Raw values', label=cnt, ax=axes[0] )



The output looks like


axes= [ ]
axes[0]= AxesSubplot(0.125,0.53;0.775x0.35)
cnt= 1
row= 1   278528
2   278528
3   278528
4   278528
5   278528
 ...
5604    278528
5605    278528
5606    278528
5607    278528
5608    278528
Name: 4, Length: 5608, dtype: int64
Traceback (most recent call last):
  File "process_csv.py", line 178, in <module>
    plot_kernels( my_dict2 )
  File "process_csv.py", line 66, in plot_kernels
    should_plot = plot_dataframe(df, cnt, axes)
  File "process_csv.py", line 38, in plot_dataframe
    ax1 = row.plot( fontsize=font_size, linewidth=line_width, 
markersize=marker_size, marker='o', title='Raw values', label=cnt, ax=axes[0] )
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_core.py", 
line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py",
 line 61, in plot
    plot_obj.generate()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 283, in generate
    self._adorn_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 483, in _adorn_subplots
    all_axes = self._get_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 903, in _get_subplots
    ax for ax in self.axes[0].get_figure().get_axes() if isinstance(ax, Subplot)
AttributeError: 'NoneType' object has no attribute 'get_axes'



The error is weird. I am stuck on it...
Any thoughts on that?


Regards,
Mahmood




Re: get_axes not present?

2021-11-21 Thread Mahmood Naderan via Python-list

>The best way to get
>assistance here on the list is to create a minimal, self-contained,
>run-able, example program that you can post in its entirety here that
>demonstrates the issue.


I created a sample program with input. Since the code processes a csv file to 
group input rows, I also included that preprocessing in the minimal code, 
although it is not where the bug is. In this sample code, I used print() to show 
the necessary information. The error occurs in the plot function. I tested the 
dictionary construction before that and it is fine.


Code is available at https://pastebin.com/giAnjJDV  and the input file 
(test.batch.csv) is available at https://pastebin.com/Hdp4Wt9B 

The run command is "python3 test.py". With the versions installed on my system, 
here is the full output:





$ python3 test.py
Reading file...
matplotlib version =  3.3.4
pandas version =  1.2.3
sys version sys.version_info(major=3, minor=8, micro=10, releaselevel='final', 
serial=0)
Original dictionary =  {'dummy': Value
M1  0
M2  0
M3  0, 'K1::foo(bar::z(x,u))':    Value  Value
0 10  2
1  5  2
2 10  2, 'K2::foo()':    Value
0 20
1 10
2 15, 'K3::foo(baar::y(z,u))':    Value
0 12
1 13
2 14, 'K3::foo(bar::y(z,u))':    Value
0  6
1  7
2  8}
New dictionary for plot =  {'dummy': Value
M1  0
M2  0
M3  0, 'K1::foo(bar::z(x,u))':    Value  Value
0 10  2
1  5  2
2 10  2, 'K3::foo(bar::y(z,u))':    Value
0  6
1  7
2  8}
Key is  K1::foo(bar::z(x,u))  -> df is Value  Value
0 10  2
1  5  2
2 10  2
axes= [ ]
axes[0]= AxesSubplot(0.125,0.53;0.775x0.35)
cnt= 1
row= 1    10
2 2
Name: 0, dtype: int64
Traceback (most recent call last):
  File "test.py", line 74, in <module>
    plot_kernels(my_dict2)
  File "test.py", line 52, in plot_kernels
    plot_dataframe(df, cnt, axes)
  File "test.py", line 36, in plot_dataframe
    ax1 = row.plot(label=cnt, ax=axes[0], marker='o')   # Line chart
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_core.py", 
line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py",
 line 61, in plot
    plot_obj.generate()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 283, in generate
    self._adorn_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 483, in _adorn_subplots
    all_axes = self._get_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 903, in _get_subplots
    ax for ax in self.axes[0].get_figure().get_axes() if isinstance(ax, Subplot)
AttributeError: 'NoneType' object has no attribute 'get_axes'



I am pretty sure that there is a version mismatch because on a system with 
Pandas 1.3.3 the output should be like https://imgur.com/a/LZ9eAzl

Any feedback is appreciated.


Regards,
Mahmood


Re: get_axes not present?

2021-11-21 Thread Mahmood Naderan via Python-list

>I installed the latest pandas, although on Python 3.10, and the script
>worked without a problem.


Yes, as I wrote, it works with 1.3.3, but mine is 1.2.3.
I am trying to keep the current version because an upgrade might have side 
effects later. In the end I may have to upgrade pandas.


Regards,
Mahmood




Re: get_axes not present?

2021-11-21 Thread Mahmood Naderan via Python-list


>Your example isn't minimal enough for me to be able to pin it down any
>better than that, though.

Chris,
I was able to simplify it even further. Please look at this:


$ cat test.batch.csv
Value,Value
10,2
5,2
10,2


$ cat test.py
import pandas as pd
import csv,sys
import matplotlib
import matplotlib.pyplot as plt

df = pd.read_csv('test.batch.csv')
print(df)

def plot_dataframe(df, cnt, axes):
    df.columns = range(1, len(df.columns)+1)   # Ignore the column header
    row = df.iloc[0].astype(int)  # First row in the dataframe
    plt.subplot(2, 1, 1)
    print("axes=", axes)
    print("axes[0]=", axes[0])
    print("cnt=", cnt)
    print("row=", row)
    ax1 = row.plot(label=cnt, ax=axes[0], marker='o')   # Line chart
    ax1.set_ylabel( 'test', fontsize=15 )
    plt.subplot(2, 1, 2)
    df2 = row.value_counts()
    df2.reindex().plot(kind='bar', label=cnt, ax=axes[1])   # Histogram

def plot_kernels(df):
    fig,axes = plt.subplots(2,1, figsize=(20, 15))
    cnt=1
    plot_dataframe(df, cnt, axes)
    cnt = cnt + 1
    for ax in axes:
        ax.legend()
    plt.show()

print("matplotlib version = ",  matplotlib.__version__)
print("pandas version = ", pd.__version__)
print("sys version", sys.version_info)

plot_kernels(df)




And the output is


$ python3 test.py
   Value  Value.1
0 10    2
1  5    2
2 10    2
matplotlib version =  3.3.4
pandas version =  1.2.3
sys version sys.version_info(major=3, minor=8, micro=10, releaselevel='final', 
serial=0)
axes= [ ]
axes[0]= AxesSubplot(0.125,0.53;0.775x0.35)
cnt= 1
row= 1    10
2 2
Name: 0, dtype: int64
Traceback (most recent call last):
  File "test.py", line 41, in <module>
    plot_kernels(df)
  File "test.py", line 29, in plot_kernels
    plot_dataframe(df, cnt, axes)
  File "test.py", line 19, in plot_dataframe
    ax1 = row.plot(label=cnt, ax=axes[0], marker='o')   # Line chart
  File 
"/home/mnaderan/.local/lib/python3.8/site-packages/pandas/plotting/_core.py", 
line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File 
"/home/mnaderan/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py",
 line 61, in plot
    plot_obj.generate()
  File 
"/home/mnaderan/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 283, in generate
    self._adorn_subplots()
  File 
"/home/mnaderan/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 483, in _adorn_subplots
    all_axes = self._get_subplots()
  File 
"/home/mnaderan/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 903, in _get_subplots
    ax for ax in self.axes[0].get_figure().get_axes() if isinstance(ax, Subplot)
AttributeError: 'NoneType' object has no attribute 'get_axes'



Any idea about that?



Regards,
Mahmood


About get_axes() in Pandas 1.2.3

2021-11-22 Thread Mahmood Naderan via Python-list

Hi

I asked a question some days ago, but due to the lack of minimal reproducing 
code, the topic got a bit messy. So, I have decided to ask it in a new topic 
with a clear minimal example.

With Pandas 1.2.3 and Matplotlib 3.3.4, the following plot() call raises an 
error and I don't know what is wrong with it.



import pandas as pd
import csv,sys
import matplotlib
import matplotlib.pyplot as plt

df = pd.read_csv('test.batch.csv')
print(df)

print("matplotlib version = ",  matplotlib.__version__)
print("pandas version = ", pd.__version__)
print("sys version", sys.version_info)

fig,axes = plt.subplots(2,1, figsize=(20, 15))
df.columns = range(1, len(df.columns)+1)   # Ignore the column header
row = df.iloc[0].astype(int)  # First row in the dataframe
plt.subplot(2, 1, 1)
print("axes=", axes)
print("axes[0]=", axes[0])
print("row=", row)
ax1 = row.plot(ax=axes[0])   # Line chart <-- ERROR
ax1.set_ylabel( 'test' )
plt.subplot(2, 1, 2)
df2 = row.value_counts()
df2.reindex().plot(kind='bar', ax=axes[1])   # Histogram

plt.show()




The output is



$ cat test.batch.csv
Value,Value
10,2
5,2
10,2

$ python3 test.py
   Value  Value.1
0     10        2
1      5        2
2     10        2
matplotlib version =  3.3.4
pandas version =  1.2.3
sys version sys.version_info(major=3, minor=8, micro=10, releaselevel='final', 
serial=0)
axes= [ ]
axes[0]= AxesSubplot(0.125,0.53;0.775x0.35)
row= 1    10
2     2
Name: 0, dtype: int64
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    ax1 = row.plot(ax=axes[0])   # Line chart
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_core.py", 
line 955, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py",
 line 61, in plot
    plot_obj.generate()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 283, in generate
    self._adorn_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 483, in _adorn_subplots
    all_axes = self._get_subplots()
  File 
"/home/mahmood/.local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py",
 line 903, in _get_subplots
    ax for ax in self.axes[0].get_figure().get_axes() if isinstance(ax, Subplot)
AttributeError: 'NoneType' object has no attribute 'get_axes'



Although the plot() crashes, I see that row and axes variables are valid. So, I 
wonder what is the workaround for this code without upgrading  Pandas or 
Matplotlib. Any idea?



Regards,
Mahmood

Re: About get_axes() in Pandas 1.2.3

2021-11-22 Thread Mahmood Naderan via Python-list

>I can help you narrow it down a bit. The problem actually occurs inside
>this function call somehow. You can verify this by doing this:
>
>
>fig,axes = plt.subplots(2,1, figsize=(20, 15))
>
>print ("axes[0].get_figure()=",axes[0].get_figure())
>
>You'll find that get_figure() is returning None, when it should be
>returning Figure(2000x1500). So plt.subplots is not doing something
>properly which was corrected at some point. Oddly enough, with pandas
>1.1.4 and matplotlib 3.2.2 (which is what my system has by default),
>there is no error, although the graph is blank.
>
>In my venv, when I upgrade matplotlib from 3.3.4 to 3.5, the problem
>also goes away.  3.4.0 also works.
>
>Honestly your solution is going to be to provide a virtual environment
>with your script.  That way you can bundle the appropriate dependencies
>without modifying anything on the host system.



Thanks for the feedback. You are right.
I agree that virtualenv is the safest method at this time.
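
As an alternative to pinning versions, one workaround that may be worth trying 
(it is not confirmed anywhere in this thread) is to drop the 
plt.subplot(2, 1, N) calls. In Matplotlib releases before 3.4, a plt.subplot() 
call that overlaps axes created by plt.subplots() appears to remove those axes 
from the figure, which would explain get_figure() returning None. Since the 
target axes are already passed through ax=, the calls are not needed. The 
sketch below inlines the CSV data and selects the Agg backend so that it runs 
without a display; both are assumptions made for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv(io.StringIO("Value,Value\n10,2\n5,2\n10,2\n"))
df.columns = range(1, len(df.columns) + 1)   # Ignore the column header
row = df.iloc[0].astype(int)                 # First row in the dataframe

fig, axes = plt.subplots(2, 1, figsize=(20, 15))

# No plt.subplot(2, 1, 1) here: the target axes are chosen with ax= instead.
ax1 = row.plot(ax=axes[0], marker="o")       # Line chart
ax1.set_ylabel("test")

counts = row.value_counts()
counts.plot(kind="bar", ax=axes[1])          # Bar chart of the value counts

# Render into memory instead of opening a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```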


Regards,
Mahmood



Extracting dataframe column with multiple conditions on row values

2022-01-07 Thread Mahmood Naderan via Python-list

Hi

I have a csv file like this

V0,V1,V2,V3
4,1,1,1
6,4,5,2
2,3,6,7

And I want to search two rows for a match and find the column. For
example, I want to search row[0] for 1 and row[1] for 5. The corresponding
column is V2 (which is the third column). Then I want to return the value
at row[2] and the found column. The result should be 6 then.

I can manually extract the specified rows (with index 0 and 1 which are
fixed) and manually iterate over them like arrays to find a match. Then I

key1 = 1
key2 = 5
row1 = df.iloc[0]  # row=[4,1,1,1]
row2 = df.iloc[1]  # row=[6,4,5,2]
for i in range(len(row1)):
    if row1[i] == key1:
        for j in range(len(row2)):
            if row2[j] == key2:
                res = df.iloc[:,j]
                print(res)  # 6





Is there a built-in function that would do this more efficiently?
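
One vectorised way to do this with built-in pandas operations is sketched 
below. It assumes the CSV shown above (inlined here as a string) and that the 
value from row 2 is wanted for every column satisfying both conditions; the 
variable names are illustrative.

```python
import io

import pandas as pd

df = pd.read_csv(io.StringIO("V0,V1,V2,V3\n4,1,1,1\n6,4,5,2\n2,3,6,7\n"))

key1 = 1
key2 = 5

# Boolean Series indexed by column label: True where both rows match.
mask = (df.iloc[0] == key1) & (df.iloc[1] == key2)
matching_columns = mask[mask].index.tolist()

# Values in row 2 for the matching column(s).
result = df.loc[2, matching_columns]
print(matching_columns, result.tolist())
```

Comparing whole rows at once avoids the nested Python loops, and the combined 
mask guarantees that both conditions hold in the same column.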





Regards,
Mahmood



Writing a string with comma in one column of CSV file

2022-01-15 Thread Mahmood Naderan via Python-list

Hi,
I use the following line to write some information to a CSV file which is comma 
delimited.

f = open(output_file, 'w', newline='')
wr = csv.writer(f)
...
f.write(str(n) + "," + str(key) + "\n" )


Problem is that key is a string which may contain ',' and this causes the final 
CSV file to have more than 2 columns, while I want to write the whole key as a 
single column.

I know that wr.writerow([key]) writes the entire key in one column, but I would 
like to do the same with write(). Any idea to fix that?


Regards,
Mahmood

Re: Writing a string with comma in one column of CSV file

2022-01-15 Thread Mahmood Naderan via Python-list

Right. I was also able to put all columns in a string and then use writerow().
Thanks.
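
For the record, a minimal sketch of the writerow() approach. With the default 
quoting (csv.QUOTE_MINIMAL) the writer quotes only fields that contain the 
delimiter, so the key stays in one column. An in-memory buffer stands in for 
the output file, and the values of n and key are invented for the example.

```python
import csv
import io

n = 3
key = "K1::foo(bar::z(x,u))"   # contains commas

buffer = io.StringIO(newline="")   # stands in for open(output_file, 'w', newline='')
writer = csv.writer(buffer)
writer.writerow([n, key])          # the writer adds quotes around key as needed

written = buffer.getvalue()

# Reading the text back confirms there are exactly two columns.
rows = list(csv.reader(io.StringIO(written)))
print(written, rows)
```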



Regards,
Mahmood






On Saturday, January 15, 2022, 10:33:08 PM GMT+1, alister via Python-list 
 wrote: 





You need to quote the data.
The easiest way to ensure this is to include the QUOTE_ALL option when 
creating the writer:

wr = csv.writer(output, quoting=csv.QUOTE_ALL)



-- 
Chocolate chip.

Unable to install "collect" via pip3

2019-12-20 Thread Mahmood Naderan via Python-list

Hi
 
I can install collect with pip for python2.7
$ pip install --user collect 
 Collecting collect
  Using cached 
https://files.pythonhosted.org/packages/cf/5e/c0f0f51d081665374a2c219ea4ba23fb1e179b70dded96dc16606786d828/collect-0.1.1.tar.gz
Collecting couchdbkit>=0.5.7 (from collect)
  Using cached 
https://files.pythonhosted.org/packages/a1/13/9e9ff695a385c44f62b4766341b97f2bd8b596962df2a0beabf358468b70/couchdbkit-0.6.5.tar.gz
Collecting restkit>=4.2.2 (from couchdbkit>=0.5.7->collect)
  Downloading 
https://files.pythonhosted.org/packages/76/b9/d90120add1be718f853c53008cf5b62d74abad1d32bd1e7097dd913ae053/restkit-4.2.2.tar.gz
 (1.3MB)
100% || 1.3MB 633kB/s
Collecting http-parser>=0.8.3 (from restkit>=4.2.2->couchdbkit>=0.5.7->collect)
  Downloading 
https://files.pythonhosted.org/packages/07/c4/22e3c76c2313c26dd5f84f1205b916ff38ea951aab0c4544b6e2f5920d64/http-parser-0.8.3.tar.gz
 (83kB)
100% || 92kB 2.4MB/s
Collecting socketpool>=0.5.3 (from restkit>=4.2.2->couchdbkit>=0.5.7->collect)
  Downloading 
https://files.pythonhosted.org/packages/d1/39/fae99a735227234ffec389b252c6de2bc7816bf627f56b4c558dc46c85aa/socketpool-0.5.3.tar.gz
Building wheels for collected packages: collect, couchdbkit, restkit, 
http-parser, socketpool
  Running setup.py bdist_wheel for collect ... done
  Stored in directory: 
/home/mnaderan/.cache/pip/wheels/b9/7c/7c/b09b334cc0e27b4f63ee9f6f19ca1f3db8672666a7e0f3d9cd
  Running setup.py bdist_wheel for couchdbkit ... done
  Stored in directory: 
/home/mnaderan/.cache/pip/wheels/f6/05/1b/f8f576ef18564bc68ab6e64f405e1263448036208cafb221e0
  Running setup.py bdist_wheel for restkit ... done
  Stored in directory: 
/home/mnaderan/.cache/pip/wheels/48/c5/32/d0d25fb272791a68c49c26150f332d9b9492d0bc9ea0cdd2c7
  Running setup.py bdist_wheel for http-parser ... done
  Stored in directory: 
/home/mnaderan/.cache/pip/wheels/22/db/06/cb609a3345e7aa87206de160f00cc6af364650d1139d904a25
  Running setup.py bdist_wheel for socketpool ... done
  Stored in directory: 
/home/mnaderan/.cache/pip/wheels/93/f6/8c/65924848766618647078cb66b1d964e8b80876536e84517469
Successfully built collect couchdbkit restkit http-parser socketpool
Installing collected packages: http-parser, socketpool, restkit, couchdbkit, 
collect
Successfully installed collect-0.1.1 couchdbkit-0.6.5 http-parser-0.8.3 
restkit-4.2.2 socketpool-0.5.3

However, pip3 fails with this error:

$ pip3 install --user collect
Collecting collect
  Using cached 
https://files.pythonhosted.org/packages/cf/5e/c0f0f51d081665374a2c219ea4ba23fb1e179b70dded96dc16606786d828/collect-0.1.1.tar.gz
Collecting couchdbkit>=0.5.7 (from collect)
  Using cached 
https://files.pythonhosted.org/packages/a1/13/9e9ff695a385c44f62b4766341b97f2bd8b596962df2a0beabf358468b70/couchdbkit-0.6.5.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-qf95n0tt/couchdbkit/setup.py", line 25, in <module>
    long_description = file(
NameError: name 'file' is not defined


Command "python setup.py egg_info" failed with error code 1 in 
/tmp/pip-build-qf95n0tt/couchdbkit/

I cannot figure out what the problem is. Is there any way to fix that?

More info:
$ which python
/usr/bin/python
$ ls -l /usr/bin/python
lrwxrwxrwx 1 root root 9 Apr 16  2018 /usr/bin/python -> python2.7
$ which python3
/usr/bin/python3
$ ls -l /usr/bin/python3
lrwxrwxrwx 1 root root 9 Jun 21  2018 /usr/bin/python3 -> python3.6 



Regards,
Mahmood

Re: Unable to install "collect" via pip3

2019-12-22 Thread Mahmood Naderan via Python-list

Yes thank you. The package is not compatible with 3.x.

Regards,
Mahmood 

On Saturday, December 21, 2019, 1:40:29 AM GMT+3:30, Barry 
 wrote:  
 
 

> On 20 Dec 2019, at 15:27, Mahmood Naderan via Python-list 
>  wrote:
> 
>    Complete output from command python setup.py egg_info:
>    Traceback (most recent call last):
>      File "<string>", line 1, in <module>
>      File "/tmp/pip-build-qf95n0tt/couchdbkit/setup.py", line 25, in <module>
>        long_description = file(
>    NameError: name 'file' is not defined

My guess is that file is python 2 only. Couchdbkit needs porting to python 3.

Barry
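
To illustrate the diagnosis: the file() builtin exists only in Python 2, and in 
Python 3 the equivalent is open(). The sketch below shows how such a setup.py 
line could be ported. The README name and its contents are hypothetical and are 
written to a temporary directory only so the example runs; this is not 
couchdbkit's actual code.

```python
import builtins
import os
import tempfile

# file() was removed in Python 3, hence the NameError in the traceback.
has_file_builtin = hasattr(builtins, "file")

with tempfile.TemporaryDirectory() as tmp_dir:
    readme_path = os.path.join(tmp_dir, "README.rst")   # hypothetical file
    with open(readme_path, "w", encoding="utf-8") as handle:
        handle.write("example description\n")

    # Python 2 style was: long_description = file(readme_path).read()
    with open(readme_path, encoding="utf-8") as handle:
        long_description = handle.read()

print(has_file_builtin, long_description)
```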


Grepping words for match in a file

2019-12-28 Thread Mahmood Naderan via Python-list

Hi
I have some lines in a text file like

ADD R1, R2
ADD3 R4, R5, R6
ADD.MOV R1, R2, [0x10]

If I grep words with this code

for line in fp:
    if my_word in line:

then, if my_word is "ADD", I get 3 matches. However, if I grep words with this 
code

for line in fp:
    for word in line.split():
        if my_word == word:

then I get only one match, which is ADD R1, R2.
Actually I want to get 2 matches, ADD R1, R2 and ADD.MOV R1, R2, [0x10], because 
these two lines are actually "ADD" instructions. However, "ADD3" is something 
else.
How can I fix the code for that purpose?
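
One possible approach, sketched under the assumption that the mnemonic is 
always the first whitespace-separated token and that a variant is marked by a 
'.' suffix (as in ADD.MOV): compare only the part of the first token before 
the first dot. The function name is made up for illustration.

```python
def matching_lines(lines, my_word):
    """Return lines whose first token is my_word, optionally followed by '.suffix'."""
    matches = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        # "ADD.MOV" -> "ADD", "ADD3" -> "ADD3", "ADD" -> "ADD"
        base_opcode = tokens[0].split(".", 1)[0]
        if base_opcode == my_word:
            matches.append(line.strip())
    return matches


sample = [
    "ADD R1, R2",
    "ADD3 R4, R5, R6",
    "ADD.MOV R1, R2, [0x10]",
]
found = matching_lines(sample, "ADD")
print(found)
```

With a real file, the open file object fp can be passed in place of the sample 
list.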
--
Regards,
Mahmood

Install python via MS batch file

2017-05-06 Thread Mahmood Naderan via Python-list

Hello,
I have downloaded python-3.6.1-amd64.exe and it installs fine through the 
GUI. However, I want to write a batch file that installs it from the command 
line. Since the installation process is interactive, an unattended batch 
install seems difficult. What I want to do is:

set python path
install in the default location.

Any idea about that?
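
The python.org Windows installer can run without its UI, so no interaction has to be scripted. A minimal sketch: `/quiet` and `PrependPath=1` are options documented for the installer, while the helper names are only illustrative. The same arguments can be given directly on a batch-file line after the installer name.

```python
import subprocess


def build_silent_install_command(installer_path):
    # /quiet        -> install without showing any UI
    # PrependPath=1 -> add the install and Scripts directories to PATH
    # No TargetDir is given, so the installer uses its default location.
    return [installer_path, "/quiet", "PrependPath=1"]


def run_silent_install(installer_path):
    # Only meaningful on Windows with the installer file present.
    return subprocess.run(build_silent_install_command(installer_path), check=True)


command = build_silent_install_command("python-3.6.1-amd64.exe")
print(" ".join(command))
```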


 Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

packaging python code

2017-05-08 Thread Mahmood Naderan via Python-list

Hi,
I have a simple piece of code which uses two libraries (numpy and openpyxl). The 
script is called from another application. Currently, if someone wants to run 
my program, he first has to install Python completely via its installer.

Is there any way to pack my .py with all required libraries and create a 
self-running package? Something like building an exe file with static libraries, 
so that the user doesn't have to install anything manually.

Please let me know if there is such a procedure.

 Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: packaging python code

2017-05-08 Thread Mahmood Naderan via Python-list

OK. I did that but it fails! Please see the stack trace below.

D:\ThinkPad\Documents\NetBeansProjects\ExcelTest>pyinstaller exread.py
96 INFO: PyInstaller: 3.2.1
96 INFO: Python: 3.6.1
98 INFO: Platform: Windows-10-10.0.14393-SP0
103 INFO: wrote D:\ThinkPad\Documents\NetBeansProjects\ExcelTest\exread.spec
109 INFO: UPX is not available.
111 INFO: Extending PYTHONPATH with paths
['D:\\ThinkPad\\Documents\\NetBeansProjects\\ExcelTest',
'D:\\ThinkPad\\Documents\\NetBeansProjects\\ExcelTest']
112 INFO: checking Analysis
113 INFO: Building Analysis because out00-Analysis.toc is non existent
113 INFO: Initializing module dependency graph...
117 INFO: Initializing module graph hooks...
119 INFO: Analyzing base_library.zip ...
Traceback (most recent call last):
  File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\Scripts\pyinstaller-script.py", line 11, in 
    load_entry_point('PyInstaller==3.2.1', 'console_scripts', 'pyinstaller')()
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\__main__.py", line 90, in run
    run_build(pyi_config, spec_file, **vars(args))
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\__main__.py", line 46, in run_build
    PyInstaller.building.build_main.main(pyi_config, spec_file, **kwargs)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\building\build_main.py", line 788, in main
    build(specfile, kw.get('distpath'), kw.get('workpath'), kw.get('clean_build'))
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\building\build_main.py", line 734, in build
    exec(text, spec_namespace)
  File "", line 16, in 
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\building\build_main.py", line 212, in __init__
    self.__postinit__()
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\building\datastruct.py", line 161, in __postinit__
    self.assemble()
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\building\build_main.py", line 317, in assemble
    excludes=self.excludes, user_hook_dirs=self.hookspath)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\depend\analysis.py", line 560, in initialize_modgraph
    graph.import_hook(m)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1509, in import_hook
    source_package, target_module_partname, level)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1661, in _find_head_package
    target_module_headname, target_package_name, source_package)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\depend\analysis.py", line 209, in _safe_import_module
    module_basename, module_name, parent_package)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2077, in _safe_import_module
    module_name, file_handle, pathname, metadata)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2167, in _load_module
    self._scan_code(m, co, co_ast)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2585, in _scan_code
    module, module_code_object, is_scanning_imports=False)
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2831, in _scan_bytecode
    global_attr_name = get_operation_arg_name()
  File "c:\users\thinkpad\appdata\local\programs\python\python36\lib\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2731, in get_operation_arg_name
    return module_code_object.co_names[co_names_index]
IndexError: tuple index out of range

D:\ThinkPad\Documents\NetBeansProjects\ExcelTest>

Regards,
Mahmood

On Monday, May 8, 2017 5:07 PM, Lutz Horn  wrote:

> Is there any way to pack my .py with all required libraries and create a self 
> running package?

Take a look at PyInstaller:

* http://www.pyinstaller.org/
* https://pyinstaller.readthedocs.io/en/stable/

Lutz
-- 
https://mail.python.org/mailman/listinfo/python-list

Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Hello,

The following code, which uses openpyxl and numpy, fails to read large Excel 
(xlsx) files. The file is 20 MB and contains 100K rows and 50 columns.

W = load_workbook(fname, read_only = True)
p = W.worksheets[0]
a=[]
m = p.max_row
n = p.max_column

np.array([[i.value for i in j] for j in p.rows])

How can I fix that? I am stuck on this problem. For medium-sized files (16K 
rows and 50 columns) it is fine.

Regards,

Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Thanks for your reply. The openpyxl part (reading the workbook) works fine. I 
printed some debug information and found that when it reaches the np.array, 
after some 10 seconds, the memory usage goes high. 


So, I think numpy is unable to manage the memory.


 
Regards,
Mahmood


On Wednesday, May 10, 2017 7:25 PM, Peter Otten <[email protected]> wrote:



Mahmood Naderan via Python-list wrote:

> Hello,
> 
> The following code, which uses openpyxl and numpy, fails to read large
> Excel (xlsx) files. The file is 20 MB and contains 100K rows and 50
> columns.
> 
> W = load_workbook(fname, read_only = True)
> p = W.worksheets[0]
> a=[]
> m = p.max_row
> n = p.max_column
> 
> np.array([[i.value for i in j] for j in p.rows])
> 
> How can I fix that? I am stuck on this problem. For medium-sized files
> (16K rows and 50 columns) it is fine.

The docs at

https://openpyxl.readthedocs.io/en/default/optimized.html#read-only-mode

promise "(near) constant memory consumption" for the sample script below:

from openpyxl import load_workbook
wb = load_workbook(filename='large_file.xlsx', read_only=True)
ws = wb['big_data']

for row in ws.rows:
    for cell in row:
        print(cell.value)

If you change only the file and worksheet name to your needs -- does the 
script run to completion in reasonable time (redirect stdout to /dev/null) 
and with reasonable memory usage?

If it does you may be wasting memory elsewhere; otherwise you might need to 
convert the xlsx file to csv using your spreadsheet application before 
processing the data in Python.

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Well, actually the cells are treated as strings and not as integer or float numbers.

One way to overcome this is to get the number of rows, split them into 4 or 5 
arrays, and then process them. However, I was looking for a better solution. 

I have read that large Excel files are in the order of a million rows. Mine is 
about 100k. Currently, the task manager shows about 4GB of RAM usage while 
working with numpy.

Regards,
Mahmood

On Wed, 5/10/17, Peter Otten <[email protected]> wrote:

 Subject: Re: Out of memory while reading excel file
 To: [email protected]
 Date: Wednesday, May 10, 2017, 3:48 PM

 Mahmood Naderan via Python-list wrote:

 > Thanks for your reply. The openpyxl part (reading the workbook) works
 > fine. I printed some debug information and found that when it reaches the
 > np.array, after some 10 seconds, the memory usage goes high.
 > 
 > So, I think numpy is unable to manage the memory.

 Hm, I think numpy is designed to manage huge arrays if you have enough RAM.

 Anyway: are all values of the same type? Then the numpy array may be kept 
 much smaller than in the general case (I think). You can also avoid the 
 intermediate list of lists:

 wb = load_workbook(filename='beta.xlsx', read_only=True)
 ws = wb['alpha']

 a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
 for y, row in enumerate(ws.rows):
     a[y] = [cell.value for cell in row]

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Hi
I will try your code... Meanwhile I have to say, as you pointed out earlier and as 
stated in the documents, numpy is designed to handle large arrays and that is 
the reason I chose it. If there is a better option, please let me know.

Regards,
Mahmood

On Wed, 5/10/17, Peter Otten <[email protected]> wrote:

 Subject: Re: Out of memory while reading excel file
 To: [email protected]
 Date: Wednesday, May 10, 2017, 6:30 PM

 Mahmood Naderan via Python-list wrote:

 > Well actually cells are treated as strings and not integer or float
 > numbers.

 May I ask why you are using numpy when you are dealing with strings? If you 
 provide a few details about what you are trying to achieve someone may be 
 able to suggest a workable approach.

 Back-of-the-envelope considerations: 4GB / 5E6 cells amounts to

 >>> 2**32 / (100000 * 50)
 858.9934592

 about 850 bytes per cell, with an overhead of

 >>> sys.getsizeof("")
 49

 that would be 800 ascii chars, down to 200 chars in the worst case. If your 
 strings are much smaller the problem lies elsewhere.

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Hi,
I am confused by that. If you say that numpy is not suitable for my case and 
may have a large overhead, what is the alternative then? Or do you mean that numpy 
is a good choice here, as long as we can reduce its overhead?

 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

>a = numpy.zeros((ws.max_row, ws.max_column), dtype=float)
>for y, row in enumerate(ws.rows):
>   a[y] = [cell.value for cell in row]



Peter, 

When I used this code, it gave me an error saying that it cannot convert a string 
to float for the first cell. All cells are strings.

 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-10 Thread Mahmood Naderan via Python-list

Hi,
I used the old-fashioned coding style to create a matrix and read/add the cells.

W = load_workbook(fname, read_only = True)
p = W.worksheets[0]
m = p.max_row
n = p.max_column
arr = np.empty((m, n), dtype=object)
for r in range(1, m):
    for c in range(1, n):
        d = p.cell(row=r, column=c)
        arr[r, c] = d.value


However, the operation is very slow. I printed the row number to see how things are 
going. It took 2 minutes to add 200 rows and about 10 minutes to add the next 
200 rows. 
 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list

I wrote this:

a = np.zeros((p.max_row, p.max_column), dtype=object)
for y, row in enumerate(p.rows):
    for cell in row:
        print (cell.value)
        a[y] = cell.value 
        print (a[y])


For one of the cells, I see

NM_198576.3
['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3']

 
These are 50 NM_198576.3 in a[y] and 50 is the number of columns in my excel 
file (p.max_column)
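
The repeated value comes from the assignment a[y] = cell.value: it sets every column of row y to that one value, so after each cell the whole row shows the same string. Indexing by both row and column keeps the cells apart; with NumPy that would be a[y, x] = cell.value. A sketch of the loop shape, using nested lists in place of the worksheet and the array so that it runs without openpyxl or numpy (the sample values, in particular the second row, are made up):

```python
# Stand-in for p.rows: each tuple holds the cell values of one row.
sample_rows = [
    ("CHR1", "11,202,100", "NM_198576.3", "-"),
    ("CHR2", "5,000", "NM_000001.1", "."),
]

n_rows = len(sample_rows)
n_cols = len(sample_rows[0])

# Pre-sized matrix, analogous to np.zeros((max_row, max_column), dtype=object).
a = [[None] * n_cols for _ in range(n_rows)]

for y, row in enumerate(sample_rows):
    for x, value in enumerate(row):
        # One cell per (row, column) position; str() keeps everything as text.
        a[y][x] = str(value)

for row in a:
    print(row)
```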



The excel file looks like

CHR1 11,202,100 NM_198576.3 PASS 3.08932G|B|C -.   
.   .



Note that in each row, some cells are '-' or '.' only. I want to read all cells 
as string. Then I will write the matrix in a file and my main code (java) will 
process that. I chose openpyxl for reading excel files, because Apache POI (a 
java package for manipulating excel files) consumes huge memory even for medium 
files.

So my python script only transforms an xlsx file to a txt file keeping the cell 
positions and formats.

Any suggestion?

Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list

Thanks. That code is so simple and works. However, there are things to be 
considered. With the CSV format, cells in a row are separated by ',' and for 
some cells it writes "" around the cell content.

So, if the excel looks like 


CHR1  11,232,445


The output file looks like

CHR1,"11,232,445"


Is it possible to use a space as the delimiting character and omit the ""? I ask 
because my java code, which has to read the output file, would otherwise have to do 
some extra work (using space as the delimiter is the default and much easier to work with). 
I want

a[0][0] = CHR1
a[0][1] = 11,232,445

And both are strings. Is that possible?
 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list

Excuse me, I changed 

csv.writer(outstream)

to 

csv.writer(outstream, delimiter =' ')


It puts a space between cells and omits the "" around some content. However, between 
two lines there is a new empty line. In other words, the first line is the first 
row of the excel file, the second line is empty ("\n"), and the third line is the 
second row of the excel file.

Any thought?
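
A likely cause, assuming Python 3 on Windows: the csv module writes "\r\n" line endings itself, and a file opened in text mode without newline='' translates the "\n" again, which shows up as an empty line after every row. A sketch with made-up rows and a temporary output path:

```python
import csv
import os
import tempfile

rows = [["CHR1", "11,232,445"], ["CHR2", "-"]]


def write_rows(path, rows):
    # newline='' leaves line endings entirely to the csv module
    with open(path, "w", newline="") as outstream:
        writer = csv.writer(outstream, delimiter=" ")
        writer.writerows(rows)


path = os.path.join(tempfile.mkdtemp(), "output.txt")
write_rows(path, rows)

# Read back without newline translation to see exactly what was written.
with open(path, newline="") as instream:
    text = instream.read()
print(repr(text))
```

Note that a cell which itself contains a space would still be wrapped in quotes, because the space is now the delimiter.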
 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Out of memory while reading excel file

2017-05-11 Thread Mahmood Naderan via Python-list

Thanks a lot for suggestions. It is now solved.

 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Concatenating files in order

2017-05-23 Thread Mahmood Naderan via Python-list

Hi,
There are some text files ending with _chunk_i where 'i' is an integer. For 
example,

XXX_chunk_0
XXX_chunk_1
...

I want to concatenate them in order. Thing is that the total number of files 
may be variable. Therefore, I can not specify the number in my python script. 
It has to be "for all files ending with _chunk_i".

Next, I can write

with open('final.txt', 'w') as outf:
    for fname in filenames:
        with open(fname) as inf:
            for line in inf:
                outf.write(line)
 

How can I specify the "filenames"?
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Concatenating files in order

2017-05-23 Thread Mahmood Naderan via Python-list

>Yup.  Make a list of all the file names, write a key function that 
>extracts the numbery bits, sort the list based on that key function, and 
>go to town.
>
>Alternatively, when you create the files in the first place, make sure 
>to use more leading zeros than you could possibly need. 
>xxx_chunk_01 sorts less than xxx_chunk_10.




So, if I write

import glob;
for f in glob.glob('*chunk*'):
  print(f)

it will print in order. Is that really sorted, or is the order not guaranteed?

Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Concatenating files in order

2017-05-23 Thread Mahmood Naderan via Python-list

OK guys thank you very much. It is better to sort them first.

Here is what I wrote

files =  glob.glob('*chunk*')
sorted=[[int(name.split("_")[-1]), name] for name in files]
with open('final.txt', 'w') as outf:
  for fname in sorted:
    with open(fname[1]) as inf:
      for line in inf:
        outf.write(line)

and it works
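
Note that the list comprehension above only pairs each name with its number; it does not reorder anything, and glob.glob() returns names in arbitrary order, so the result depends on the filesystem. It also rebinds the built-in name sorted. A sketch that actually sorts by the numeric suffix, assuming every matching name ends in "_<integer>" (the helper names and the demo files are illustrative):

```python
import glob
import os
import tempfile


def chunk_number(name):
    # "XXX_chunk_10" -> 10; raises ValueError if the suffix is not an integer
    return int(name.rsplit("_", 1)[-1])


def concatenate_chunks(directory, out_path):
    pattern = os.path.join(directory, "*_chunk_*")
    files = sorted(glob.glob(pattern), key=chunk_number)
    with open(out_path, "w") as outf:
        for fname in files:
            with open(fname) as inf:
                for line in inf:
                    outf.write(line)
    return files


# Demo with throwaway files, created out of order on purpose.
workdir = tempfile.mkdtemp()
for i in (10, 2, 0, 1):
    with open(os.path.join(workdir, "XXX_chunk_%d" % i), "w") as f:
        f.write("part %d\n" % i)

final_path = os.path.join(workdir, "final.txt")
ordered = concatenate_chunks(workdir, final_path)
with open(final_path) as f:
    result = f.read()
print([os.path.basename(p) for p in ordered])
```

A plain sorted(files) without the key would put XXX_chunk_10 before XXX_chunk_2.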
Regards,
Mahmood

On Wednesday, May 24, 2017 1:20 AM, bartc  wrote:

On 23/05/2017 20:55, Rob Gaddi wrote:

> Yup.  Make a list of all the file names, write a key function that
> extracts the numbery bits, sort the list based on that key function, and
> go to town.

Is it necessary to sort them? If XXX is known, then presumably the first 
file will be called XXX_chunk_0, the next XXX_chunk_1 and so on.

It would be possible to iterate over such a sequence of filenames, and 
keep opening them then writing them to the output until there are no 
more files. Or, if a list of matching files is obtained, the length of 
the list will also give you the last filename.

(But this won't work if there are gaps in the sequence or the numeric 
format is variable.)

-- 
bartc

-- 
https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Concatenating files in order

2017-05-25 Thread Mahmood Naderan via Python-list

Hi guys,


Cameron, thanks for the points. In fact the file name contains multiple '_' 
characters. So, I appreciate what you recommended.

  filenames = {}
  for name in glob.glob('*chunk_*'):
    left, right = name.rsplit('_', 1)
    if left.endswith('chunk') and right.isdigit():
      filenames[int(right)] = filename
  sorted_filenames = [ filenames[k] for k in sorted(filenames.keys()) ]

It seems that 'filename' should be 'right'.  

Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Concatenating files in order

2017-05-26 Thread Mahmood Naderan via Python-list

Thank you very much. I understand that

Regards,
Mahmood

On Friday, May 26, 2017 5:01 AM, Cameron Simpson  wrote:

On 25May2017 20:37, Mahmood Naderan  wrote:

>Cameron, thanks for the points. In fact the file name contains multiple '_' 
>characters. So, I appreciate what you recommended.
>
>  filenames = {}
>  for name in glob.glob('*chunk_*'):
>    left, right = name.rsplit('_', 1)
>    if left.endswith('chunk') and right.isdigit():
>      filenames[int(right)] = filename
>  sorted_filenames = [ filenames[k] for k in sorted(filenames.keys()) ]
>
>It seems that 'filename' should be 'right'.

No, 'filename' should be 'name': the original filename. Thanks for the catch.

The idea is to have a map of int->filename so that you can open the files in 
numeric order.  So 'right' is just the numeric suffix - you need 'name' for the 
open() call.
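
Put together with that correction, the int->filename map can be sketched as below. The function name and the demo files are made up; the logic is the snippet quoted above with 'filename' replaced by 'name':

```python
import glob
import os
import tempfile


def numbered_chunks(directory):
    """Return the chunk files in directory, ordered by their numeric suffix."""
    filenames = {}
    for name in glob.glob(os.path.join(directory, "*chunk_*")):
        left, right = name.rsplit("_", 1)
        if left.endswith("chunk") and right.isdigit():
            # key: numeric suffix, value: the full name needed for open()
            filenames[int(right)] = name
    return [filenames[k] for k in sorted(filenames)]


# Demo: names with several '_' characters and one non-numeric suffix.
workdir = tempfile.mkdtemp()
for suffix in ("11", "0", "3", "old"):
    with open(os.path.join(workdir, "run_1_chunk_" + suffix), "w") as f:
        f.write(suffix + "\n")

sorted_filenames = numbered_chunks(workdir)
basenames = [os.path.basename(p) for p in sorted_filenames]
print(basenames)
```

The isdigit() check is what lets stray files such as run_1_chunk_old be skipped instead of raising an error.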

Cheers,
Cameron Simpson 
-- 
https://mail.python.org/mailman/listinfo/python-list

embed a package for proper fun script

2017-05-29 Thread Mahmood Naderan via Python-list

Hello,
How is it possible to embed a package in my project? I mean, in my python 
script I have written

import openpyxl

However, the user may not have installed that package and may not know what 
pip is!
Please let me know the instructions or any document regarding that.

 Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: embed a package for proper fun script

2017-05-30 Thread Mahmood Naderan via Python-list

No idea?...

 
Regards,
Mahmood


On Tuesday, May 30, 2017 1:06 AM, Mahmood Naderan via Python-list wrote:

Hello,
How is it possible to embed a package in my project? I mean, in my python 
script I have written

import openpyxl

However, the user may not have installed that package and may not know what 
pip is!
Please let me know the instructions or any document regarding that.

Regards,
Mahmood

-- 
https://mail.python.org/mailman/listinfo/python-list

Python not able to find package but it is installed

2017-05-30 Thread Mahmood Naderan via Python-list

Hello,
Although I have installed a package via pip on CentOS 6.6, the python interpreter 
still says there is no such package!

Please see the output below

$ python exread2.py input.xlsx tmp/output
Traceback (most recent call last):
File "/home/mahmood/excetest/exread2.py", line 1, in 
from openpyxl import load_workbook
ImportError: No module named openpyxl


$ pip install openpyxl
DEPRECATION: Python 2.6 is no longer supported by the Python core team, please 
upgrade your Python. A future version of pip will drop support for Python 2.6
Requirement already satisfied: openpyxl in 
/opt/rocks/lib/python2.6/site-packages

...


$ ls -l /opt/rocks/lib/python2.6/site-packages/openpyxl*
/opt/rocks/lib/python2.6/site-packages/openpyxl:
total 84
drwxr-xr-x 2 root root 4096 May 30 12:26 cell
drwxr-xr-x 2 root root 4096 May 30 12:26 chart
drwxr-xr-x 2 root root 4096 May 30 12:26 chartsheet
drwxr-xr-x 2 root root 4096 May 30 12:26 comments
drwxr-xr-x 2 root root 4096 May 30 12:26 compat
-rw-r--r-- 1 root root 1720 Mar 16 21:35 conftest.py
-rw-r--r-- 1 root root 2111 May 30 12:26 conftest.pyc
drwxr-xr-x 2 root root 4096 May 30 12:26 descriptors
drwxr-xr-x 2 root root 4096 May 30 12:26 drawing
drwxr-xr-x 2 root root 4096 May 30 12:26 formatting
drwxr-xr-x 2 root root 4096 May 30 12:26 formula
-rw-r--r-- 1 root root  880 Mar 16 21:35 __init__.py
-rw-r--r-- 1 root root 1009 May 30 12:26 __init__.pyc
drwxr-xr-x 2 root root 4096 May 30 12:26 packaging
drwxr-xr-x 2 root root 4096 May 30 12:26 reader
drwxr-xr-x 2 root root 4096 May 30 12:26 styles
drwxr-xr-x 2 root root 4096 May 30 12:26 utils
drwxr-xr-x 3 root root 4096 May 30 12:26 workbook
drwxr-xr-x 2 root root 4096 May 30 12:26 worksheet
drwxr-xr-x 2 root root 4096 May 30 12:26 writer
drwxr-xr-x 2 root root 4096 May 30 12:26 xml

/opt/rocks/lib/python2.6/site-packages/openpyxl-2.4.7-py2.6.egg-info:
total 36
-rw-r--r-- 1 root root     1 May 30 12:26 dependency_links.txt
-rw-r--r-- 1 root root 11193 May 30 12:26 installed-files.txt
-rw-r--r-- 1 root root  2381 May 30 12:26 PKG-INFO
-rw-r--r-- 1 root root    16 May 30 12:26 requires.txt
-rw-r--r-- 1 root root  5224 May 30 12:26 SOURCES.txt
-rw-r--r-- 1 root root     9 May 30 12:26 top_level.txt




Any idea to fix that?

Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: embed a package for proper fun script

2017-05-30 Thread Mahmood Naderan via Python-list

Thanks. I will try and come back later.

 
Regards,
Mahmood


On Tuesday, May 30, 2017 2:03 PM, Paul Moore  wrote:



On Tuesday, 30 May 2017 08:48:34 UTC+1, Mahmood Naderan  wrote:
> No idea?...
> 
> Regards,
> Mahmood
> 
> On Tuesday, May 30, 2017 1:06 AM, Mahmood Naderan via Python-list wrote:
> 
> Hello,
> How is it possible to embed a package in my project? I mean, in my python 
> script I have written
> 
> import openpyxl
> 
> However, the user may not have installed that package and may not know what 
> pip is!
> Please let me know the instructions or any document regarding that.

You might want to look at the zipapp module in the stdlib.

Paul

-- 
https://mail.python.org/mailman/listinfo/python-list
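
For readers following up on the zipapp suggestion, a minimal sketch of what it does. The directory name myapp and the one-line __main__.py are hypothetical; zipapp.create_archive() is the stdlib call. Pure-Python dependencies such as openpyxl would have to be copied into the directory first, for example with pip's --target option, and the user still needs a Python interpreter to run the archive.

```python
import os
import subprocess
import sys
import tempfile
import zipapp

# Hypothetical project layout: a directory that contains __main__.py.
workdir = tempfile.mkdtemp()
app_dir = os.path.join(workdir, "myapp")
os.mkdir(app_dir)
with open(os.path.join(app_dir, "__main__.py"), "w") as f:
    f.write("print('hello from the archive')\n")

# Dependencies could be placed next to __main__.py beforehand, e.g.
#   python -m pip install openpyxl --target myapp

# Bundle the whole directory into a single runnable .pyz file.
archive = os.path.join(workdir, "myapp.pyz")
zipapp.create_archive(app_dir, archive)

# Run the archive with the current interpreter and capture what it prints.
output = subprocess.check_output([sys.executable, archive]).decode().strip()
print(output)
```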

Re: Python not able to find package but it is installed

2017-05-30 Thread Mahmood Naderan via Python-list

Well, on rocks there exist multiple pythons. But by default the active one is 2.6.6

$ python -V
Python 2.6.6


I have to say that the script doesn't modify sys.path. I only use sys.argv[] 
there

I can put all dependent modules in my project folder but that will be dirty.

 
Regards,
Mahmood


On Tuesday, May 30, 2017 2:09 PM, Wolfgang Maier 
 wrote:
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Python not able to find package but it is installed

2017-05-30 Thread Mahmood Naderan via Python-list

Well yes. It looks in other folders

>>> import openpyxl
# trying openpyxl.so
# trying openpyxlmodule.so
# trying openpyxl.py
# trying openpyxl.pyc
# trying /usr/lib64/python2.6/openpyxl.so
# trying /usr/lib64/python2.6/openpyxlmodule.so
# trying /usr/lib64/python2.6/openpyxl.py
# trying /usr/lib64/python2.6/openpyxl.pyc
# trying /usr/lib64/python2.6/plat-linux2/openpyxl.so
# trying /usr/lib64/python2.6/plat-linux2/openpyxlmodule.so
# trying /usr/lib64/python2.6/plat-linux2/openpyxl.py
# trying /usr/lib64/python2.6/plat-linux2/openpyxl.pyc
# trying /usr/lib64/python2.6/lib-dynload/openpyxl.so
# trying /usr/lib64/python2.6/lib-dynload/openpyxlmodule.so
# trying /usr/lib64/python2.6/lib-dynload/openpyxl.py
# trying /usr/lib64/python2.6/lib-dynload/openpyxl.pyc
# trying /usr/lib64/python2.6/site-packages/openpyxl.so
# trying /usr/lib64/python2.6/site-packages/openpyxlmodule.so
# trying /usr/lib64/python2.6/site-packages/openpyxl.py
# trying /usr/lib64/python2.6/site-packages/openpyxl.pyc
# trying /usr/lib64/python2.6/site-packages/gtk-2.0/openpyxl.so
# trying /usr/lib64/python2.6/site-packages/gtk-2.0/openpyxlmodule.so
# trying /usr/lib64/python2.6/site-packages/gtk-2.0/openpyxl.py
# trying /usr/lib64/python2.6/site-packages/gtk-2.0/openpyxl.pyc
# trying /usr/lib/python2.6/site-packages/openpyxl.so
# trying /usr/lib/python2.6/site-packages/openpyxlmodule.so
# trying /usr/lib/python2.6/site-packages/openpyxl.py
# trying /usr/lib/python2.6/site-packages/openpyxl.pyc
Traceback (most recent call last):
File "", line 1, in 
ImportError: No module named openpyxl




But

$ find /opt -name openpyxl
/opt/rocks/lib/python2.6/site-packages/openpyxl



 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Python not able to find package but it is installed

2017-05-31 Thread Mahmood Naderan via Python-list

Consider this output

[root@cluster ~]# pip --version
pip 9.0.1 from /opt/rocks/lib/python2.6/site-packages/pip-9.0.1-py2.6.egg 
(python 2.6)
[root@cluster ~]# easy_install --version
distribute 0.6.10
[root@cluster ~]# find /opt -name python
/opt/rocks/lib/graphviz/python
/opt/rocks/bin/python
/opt/rocks/usr/bin/python
/opt/python
/opt/python/bin/python
[root@cluster ~]# find /usr -name python
/usr/include/google/protobuf/compiler/python
/usr/bin/python
/usr/share/doc/m2crypto-0.20.2/demo/Zope/lib/python
/usr/share/doc/m2crypto-0.20.2/demo/ZopeX3/install_dir/lib/python
/usr/share/doc/m2crypto-0.20.2/demo/Zope27/install_dir/lib/python
/usr/share/gdb/python
/usr/share/swig/1.3.40/python
[root@cluster ~]# find /opt -name pip
/opt/rocks/lib/python2.6/site-packages/pip-9.0.1-py2.6.egg/pip
/opt/rocks/bin/pip
[root@cluster ~]# find /usr -name pip
[root@cluster ~]#




So, yes there are multiple versions of python and it seems that the search 
location of pip and python are different. I will try to modify the path to see 
what is what.
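
A quick way to confirm such a mismatch is to ask the interpreter itself where it lives and where it searches, and to invoke pip through that same interpreter. A sketch with illustrative helper names; as far as I know, running a package with -m needs Python 2.7 or later, so on the 2.6 installation above only the diagnostic part applies:

```python
import sys


def describe_interpreter():
    # Which binary is running, its version, and the module search path.
    return {
        "executable": sys.executable,
        "version": tuple(sys.version_info[:3]),
        "search_path": list(sys.path),
    }


def pip_install_command(package):
    # Running pip via the same interpreter installs into *its* site-packages.
    return [sys.executable, "-m", "pip", "install", package]


info = describe_interpreter()
print(info["executable"])
for entry in info["search_path"]:
    print("  ", entry)
print(" ".join(pip_install_command("openpyxl")))
```

If /opt/rocks/lib/python2.6/site-packages is missing from the printed search path, the import fails even though pip reports the package as installed.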
 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

openpyxl reads cell with format

2017-06-05 Thread Mahmood Naderan via Python-list

Hello guys...
With openpyxl, it seems that when the content of a cell is something like 
"4-Feb", then it is read as "2016-02-04 00:00:00" that looks like a calendar 
conversion.

How can I read the cell as text instead of such an automatic conversion?
 Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Openpyxl cell format

2017-06-05 Thread Mahmood Naderan via Python-list

Hello guys,

With openpyxl, it seems that when the content of a cell is something like 
"4-Feb", then it is read as "2016-02-04 00:00:00" that looks like a calendar 
conversion.

How can I read the cell as text instead of such an automatic conversion?
 Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Openpyxl cell format

2017-06-05 Thread Mahmood Naderan via Python-list

Maybe... But specifically in my case, the excel file is exported from a web 
page. I think there should be a way to read the content as a pure text.

 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: openpyxl reads cell with format

2017-06-05 Thread Mahmood Naderan via Python-list

>if the cell is an Excel date, it IS stored as a numeric

As I said, the "shape" of the cell is similar to date. The content which is 
"4-Feb" is not a date. It is a string which I expect from cell.value to read it 
as "4-Feb" and nothing else.

Also, I said before that the file is downloaded from a website. That means, 
from a button on a web page, I chose "export as excel" to download the data. I 
am pretty sure that auto format feature of the excel is trying to convert it as 
a date.


So, I am looking for a way to ignore such auto formatting.
 
Regards,
Mahmood
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: openpyxl reads cell with format

2017-06-05 Thread Mahmood Naderan via Python-list

OK thank you very much. As you said, it seems that it is too late for my python 
script.

 
Regards,
Mahmood


On Monday, June 5, 2017 10:41 PM, Dennis Lee Bieber wrote:

On Mon, 5 Jun 2017 14:46:18 +0000 (UTC), Mahmood Naderan via Python-list
declaimed the following:

>>if the cell is an Excel date, it IS stored as a numeric
>
>As I said, the "shape" of the cell is similar to date. The content which is 
>"4-Feb" is not a date. It is a string which I expect from cell.value to read 
>it as "4-Feb" and nothing else.
>
>Also, I said before that the file is downloaded from a website. That means, 
>from a button on a web page, I chose "export as excel" to download the data. I 
>am pretty sure that auto format feature of the excel is trying to convert it 
>as a date.
>

Then you'll have to modify the Excel file before the "export" to tell
IT that the column is plain text BEFORE the data goes into the column.

The normal behavior for Excel is: if something looks like a date
(partial or full) when entered, Excel will store it as a numeric "days from
epoch" and flag the cell as a "date" field. The visual representation is up
to the client -- as my sample table shows, the very same input value looks
different based upon how the column is defined.

>
>So, I am looking for a way to ignore such auto formatting.
>

By the time Python sees it, it is too late -- all you have is an
integer number tagged as a "date", and an import process that renders that
number as a Python datetime object (which you can then render however you
want
https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
)

-- 
Wulfraed Dennis Lee Bieber AF6VN
[email protected]    http://wlfraed.home.netcom.com/

-- 
https://mail.python.org/mailman/listinfo/python-list
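
As a workaround on the Python side, the datetime that cell.value returns can be rendered back into the "4-Feb" shape. This is only a sketch: it assumes every date-typed cell really was day-month text, it cannot tell whether the year was part of the original, and the helper name is made up. %b gives English month abbreviations in the default C locale.

```python
from datetime import datetime


def as_day_month_text(value):
    """Render date-like cell values as e.g. '4-Feb'; pass other values through as text."""
    if isinstance(value, datetime):
        # value.day has no leading zero, unlike strftime("%d")
        return "%d-%s" % (value.day, value.strftime("%b"))
    return "" if value is None else str(value)


print(as_day_month_text(datetime(2016, 2, 4, 0, 0, 0)))
print(as_day_month_text("CHR1"))
```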
