Data visualization with plotly
plotly visualization 2021. 11. 29. 11:32
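Python already has visualization libraries that work with pandas, such as matplotlib and seaborn, but plotly charts are interactive: hovering the mouse over a chart shows the underlying values, which makes them easier to read than the default static plots. The charts also look clean out of the box. With pandas' built-in plotting you would need extra work with something like D3.js to get a comparable result, whereas plotly produces tidy, good-looking charts with much less effort, and I consider that a big advantage. There are many chart types and each deserves a detailed explanation, but for now let's just get a feel for what plotly charts look like, starting with the basics.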
See the link for more details.
1. plotly
1-1 Importing libraries
In [1]:
import pandas as pd
import numpy as np
import chart_studio.plotly as py
import cufflinks as cf

cf.go_offline(connected=True)  # explicitly enable offline mode so charts render without going online
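The import cell assumes that plotly, cufflinks, and chart_studio are already installed. A minimal sketch, using only the standard library, for checking this before running the notebook. The distribution names in `REQUIRED` are the PyPI names as I understand them (an assumption, not something the original post states), and the helper `installed_versions` is a hypothetical name.

```python
from importlib import metadata

# PyPI distribution names assumed for the libraries imported above.
REQUIRED = ["pandas", "numpy", "plotly", "cufflinks", "chart-studio"]


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(REQUIRED)
for name, version in versions.items():
    # Print the version, or a hint on how to install a missing package.
    print(f"{name}: {version or 'missing (try: pip install ' + name + ')'}")
```

Anything reported as missing can usually be installed with pip under the same name.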
1-2 Generating random numbers for the charts
In [2]:
df = pd.DataFrame(np.random.rand(10, 4), columns=['A', 'B', 'C', 'D'])
df
Out[2]:
          A         B         C         D
0  0.980952  0.051831  0.307102  0.545472
1  0.663816  0.109954  0.307368  0.645321
2  0.187076  0.128564  0.108641  0.459621
3  0.069446  0.047134  0.694410  0.228094
4  0.350516  0.309011  0.644628  0.796751
5  0.441577  0.521666  0.078167  0.665104
6  0.902608  0.569566  0.297261  0.539314
7  0.700498  0.143882  0.793970  0.701236
8  0.097717  0.667448  0.574982  0.598981
9  0.210314  0.145670  0.646448  0.069164
1-3 Checking the available charts
In [3]:
cf.help()
Use 'cufflinks.help(figure)' to see the list of available parameters for the given figure.
Use 'DataFrame.iplot(kind=figure)' to plot the respective figure
Figures:
    bar
    box
    bubble
    bubble3d
    candle
    choroplet
    distplot
    heatmap
    histogram
    ohlc
    pie
    ratio
    scatter
    scatter3d
    scattergeo
    spread
    surface
    violin
1-4 Bar Chart
The detailed help for the bar chart shows that the bar mode can be set to group, stack, or overlay. Let's draw each one and check.
1-4-1 bar details
In [4]:
cf.help('bar')  # show the detailed parameters
BAR
Bar Chart
Supports categories and horizontal bar charts

Parameters:
===========
    bargap : float
        Sets the gap between bars
        [0,1)
    bargroupgap : float
        Sets the gap between groups
        [0,1)
    barmode : string
        Bar mode
            group
            stack
            overlay
    categories : string
        Name of the column that contains the categories
    orientation : string
        Sets the orientation of the bars.
            h
            v
    sortbars : bool
        Sort bars in descending order
    colors : dict, list or string
        Trace color
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    colorscale : string
        Color scale name
        If the color is preceded by a minus (-) then the scale is inversed.
        Only valid if 'colors' is null.
        see cufflinks.colors.scales() for all available scales
    data : figure
        Plotly Figure
    rangeslider : bool or dict
        Defines if a range slider is displayed.
        If True : displays a range slider
        If dict : defines a range slider object
        Example: {'bgcolor':('blue',.3),'autorange':True}
    text : string
        Name of the column that contains the text values
    width : int, list or dict
        Line width
            int : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe

ERROR BARS
    error_color : string
        Color for error bars
    error_opacity : float [0,1]
        Color opacity for the error bars
    error_thickness : float
        Sets the line thickness of the error bars
    error_trace : string
        Name of the DataFrame column for which error should be plotted.
        If omitted then errors apply to all traces.
    error_type : string
        Type of error bars
            data
            constant
            percent
            sqrt
            continuous
            continuous_percent
    error_values_minus : int, float or list
        Values corresponding to the span of the error bars below the trace coordinate
    error_width : int or float
        Sets the width (in pixels) of the cross-bar at both ends of the error bars
    error_x : int, float or list
        Error values for the x axis
    error_y : int, float or list
        Error values for the y axis

LAYOUT
    layout : Plotly Layout
        If defined, this Layout is explicitly used for the Figure generation
    dimensions : tuple
        Dimensions for image/chart
        (width,height)
    fontfamily : string
        HTML Font typeface that will be applied
        It needs to exist on the system on which it operates.
        Examples:
            'Times New Roman'
            'Open Sans'
            'Monospace'
    gridcolor : string
        Sets the grid color
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    legend : string
        Defines where the legend should appear
        Values:
            bottom
            top
    margin : dict or tuple
        Sets the margin dimensions
            {'l':left,'r':right,'b':bottom,'t':top}
            (left,right,bottom,top)
    secondary_y : string
        Name of the column(s) to be charted on the secondary axis
    secondary_y_title : string
        Sets the title of the secondary axis
    showlegend : bool
        Defines if the legend should appear
    theme : string
        Layout theme
            solar
            pearl
            white
        see cufflinks.getThemes() for all available themes
    title : string
        Chart title
    xTitle : string
        X Axis Title
    yTitle : string
        Y Axis Title
    zerolinecolor : string
        Sets the zero line color
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    layout_update : dict
        The Layout will be explicitly modified with the values stated in the dictionary.
        Not valid when Layout is passed as a parameter

ANNOTATIONS
    annotations : dict
        Dictionary of annotations
        {x_point : text}
    fontcolor : string
        Text color
    fontsize : int
        Text size
    textangle : int
        Text angle

EXPORTS
    asFigure : bool
        If True then it returns a Plotly Figure
    asImage : bool
        If True then it returns an image (PNG)
        While in ONLINE mode:
            Image file is saved in the working directory
            Accepts:
                filename
                dimensions
                scale
                display_image
        While in OFFLINE mode:
            Image file is downloaded (downloads folder) and a regular plotly chart is displayed in Jupyter
            Accepts:
                filename
                dimensions
    asPlot : bool
        If True then the chart opens in a browser
    asURL : bool
        If True the chart url/path is returned. No chart is displayed.
        If ONLINE : The URL is returned
        If OFFLINE : the local path is returned
    display_image : bool
        If True, then the image is displayed after being saved.
        Only valid if 'asImage=True'
    filename : string
        Filename to be saved as
    online : bool
        If True then the chart/image is rendered on the server even when running in Offline mode
    scale : int
        Increase the resolution of the image by `scale` amount
        Only valid if 'asImage=True'
    sharing : string
        Sets the sharing level permission
            public - anyone can see the chart
            private - only you can see this chart
            secret - only people with the link can see the chart

SHAPES
    hline : float, list or dict
        Draws a horizontal line at the indicated 'y' position(s).
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            hline=4
            hline=[2,10]
            hline=[{'y':2,'color':'blue'},{'y':3,'color':'red'}]
    hspan : tuple, list or dict
        Draws a horizontal rectangle at the indicated (y0,y1) positions.
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            hspan=(1,5)
            hspan=[(1,4),(6,10)]
            hspan=[{'y0':2,'y1':5,'color':'blue','fill':True,'opacity':.4}]
    vline : float, list or dict
        Draws a vertical line at the indicated 'x' position(s).
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            vline=4
            vline=[2,10]
            vline=[{'x':'2015-02-08','color':'blue'},{'x':'2015-03-08','color':'red'}]
    vspan : tuple, list or dict
        Draws a vertical rectangle at the indicated (x0,x1) positions.
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            vspan=('2015-02-08','2015-03-08')
            vspan=[(1,4),(6,10)]
            vspan=[{'x0':2,'x1':5,'color':'blue','fill':True,'opacity':.4}]
    shapes : list or dict
        List of dictionaries with the specification of a given shape.
        For more information see help(cufflinks.tools.get_shape)

SUBPLOTS
    horizontal_spacing : float [0-1]
        Space between subplot columns
    shape : (int,int)
        Indicates the size of rows and columns.
        If ommitted, then the shape is automatically set
        * Only valid if subplots=True
        (rows,columns)
    shared_xaxes : bool
        If True, subplots in the same grid column have one common shared x-axis at the bottom of the grid.
    shared_yaxes : bool
        If True, subplots in the same grid row have one common shared y-axis at the left of the grid.
    subplot_titles : bool
        If True, chart titles are displayed at the top of each subplot.
    subplots : bool
        If True then each trace is placed in a subplot
    vertical_spacing : float [0-1]
        Space between subplot rows

AXIS
    logx : bool
        Sets the x axis to be of logarithmic scale
    logy : bool
        Sets the y axis to be of logarithmic scale
    logz : bool
        Sets the z axis to be of logarithmic scale
    xrange : tuple
        Sets the range for the x axis
        (lower_bound,upper_bound)
    yrange : tuple
        Sets the range for the y axis
        (lower_bound,upper_bound)

EXAMPLES
    >> cf.datagen.bars(10,5).iplot(kind='bar')
    >> cf.datagen.bars().iplot(kind='bar',orientation='h')
In [5]:
df.iplot(kind='bar')
1-4-2 bar chart, mode = group
In [6]:
df.iplot(kind='bar', barmode='group')
1-4-3 bar chart, mode = stack
In [7]:
df.iplot(kind='bar', barmode='stack')
1-4-4 bar chart, mode = overlay
In [8]:
df.iplot(kind='bar', barmode='overlay')
1-4-5 bar chart, orientation = "h" or "v" (direction)
In [9]:
df.iplot(kind='bar', orientation='h')
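For reference, `barmode` and `orientation` end up in different places of a Plotly figure. The following is a sketch in plain Python dictionaries, not cufflinks' own code: `barmode` is a layout attribute, while `orientation` is set on each bar trace, and a horizontal bar swaps which axis carries the values. The helper name `bar_figure` and the sample data are made up for illustration.

```python
def bar_figure(columns, barmode="group", orientation="v"):
    """Build a Plotly-style figure dict with one bar trace per column.

    columns: {column name: list of values}; the row index is 0..n-1.
    """
    if barmode not in ("group", "stack", "overlay"):
        raise ValueError(f"unsupported barmode: {barmode!r}")
    if orientation not in ("v", "h"):
        raise ValueError(f"unsupported orientation: {orientation!r}")

    traces = []
    for name, values in columns.items():
        index = list(range(len(values)))
        if orientation == "v":
            x, y = index, list(values)   # values run up the y axis
        else:
            x, y = list(values), index   # values run along the x axis
        traces.append({"type": "bar", "name": name,
                       "x": x, "y": y, "orientation": orientation})
    # barmode applies to the whole chart, so it lives in the layout.
    return {"data": traces, "layout": {"barmode": barmode}}


sample = {"A": [0.98, 0.66, 0.19], "B": [0.05, 0.11, 0.13]}
stacked = bar_figure(sample, barmode="stack")
horizontal = bar_figure(sample, orientation="h")
print(stacked["layout"], horizontal["data"][0]["x"])
```

If plotly is installed, a dictionary of this shape can be displayed with `plotly.io.show(stacked)`.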
1-5 scatter, line chart
The scatter and line charts appear to be the same: drawing with line instead of scatter produces an identical chart. Let's confirm this by eye.
In [10]:
cf.help('scatter')
# cf.help('line')  # prints the same details as scatter
SCATTER
Scatter plot
2D chart based in an x and y axis. Can be a line chart (default) or a set of scatter points (markers)

Parameters:
===========
    bestfit : bool or list
        Displays a best fit line
        If list, then a best fit line will be generated for each trace key in the list
    bestfit_colors : dict or list
        Sets the color for each best fit line
            {key:color} to set the color for each trace
            [color1, color2...] to set the colors in the specified order
    categories : string
        Name of the column that contains the categories
    connectgaps : bool
        If True, then empty values are connected
    dash : string, list or dict
        Line style
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values :
            solid
            dash
            dashdot
            dot
    fill : bool
        Fills the trace (area)
    interpolation : string, list or dict
        Interpolation over missing empty line points
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values
            linear
            spline
            vhv
            hvh
            vh
            hv
    mode : string, list or dict
        Plotting mode for scatter trace
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values :
            lines
            markers
            lines+text
            markers+text
            lines+markers+text
    size : int, string
        Size of symbol when mode='markers'
    symbol : string, list or dict
        Symbol used when mode='markers'
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values
            circle
            cross
            diamond
            square
            triangle-down
        See all possible values in 'help(cufflinks.scatter.Marker())'
    x : string
        Name of the column that contains the x axis value
    y : string
        Name of the column that contains the y axis value
    rangeselector : dict
        Defines a range selector object.
        See 'help(cufflinks.tools.get_range_selector)' for more information.
        Example: {'steps':['1y','2 months','ytd','2mtd'], 'axis':'xaxis1', 'bgcolor' : ('blue',.3), 'x': 0.2 , 'y' : 0.9}
    colors : dict, list or string
        Trace color
            string : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe
        values
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    colorscale : string
        Color scale name
        If the color is preceded by a minus (-) then the scale is inversed.
        Only valid if 'colors' is null.
        see cufflinks.colors.scales() for all available scales
    data : figure
        Plotly Figure
    rangeslider : bool or dict
        Defines if a range slider is displayed.
        If True : displays a range slider
        If dict : defines a range slider object
        Example: {'bgcolor':('blue',.3),'autorange':True}
    text : string
        Name of the column that contains the text values
    width : int, list or dict
        Line width
            int : applies to all traces
            list : applies to each trace in the order specified
            dict : {column:value} for each column in the dataframe

ERROR BARS
    error_color : string
        Color for error bars
    error_opacity : float [0,1]
        Color opacity for the error bars
    error_thickness : float
        Sets the line thickness of the error bars
    error_trace : string
        Name of the DataFrame column for which error should be plotted.
        If omitted then errors apply to all traces.
    error_type : string
        Type of error bars
            data
            constant
            percent
            sqrt
            continuous
            continuous_percent
    error_values_minus : int, float or list
        Values corresponding to the span of the error bars below the trace coordinate
    error_width : int or float
        Sets the width (in pixels) of the cross-bar at both ends of the error bars
    error_x : int, float or list
        Error values for the x axis
    error_y : int, float or list
        Error values for the y axis

LAYOUT
    layout : Plotly Layout
        If defined, this Layout is explicitly used for the Figure generation
    dimensions : tuple
        Dimensions for image/chart
        (width,height)
    fontfamily : string
        HTML Font typeface that will be applied
        It needs to exist on the system on which it operates.
        Examples:
            'Times New Roman'
            'Open Sans'
            'Monospace'
    gridcolor : string
        Sets the grid color
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    legend : string
        Defines where the legend should appear
        Values:
            bottom
            top
    margin : dict or tuple
        Sets the margin dimensions
            {'l':left,'r':right,'b':bottom,'t':top}
            (left,right,bottom,top)
    secondary_y : string
        Name of the column(s) to be charted on the secondary axis
    secondary_y_title : string
        Sets the title of the secondary axis
    showlegend : bool
        Defines if the legend should appear
    theme : string
        Layout theme
            solar
            pearl
            white
        see cufflinks.getThemes() for all available themes
    title : string
        Chart title
    xTitle : string
        X Axis Title
    yTitle : string
        Y Axis Title
    zerolinecolor : string
        Sets the zero line color
            colorname : see cufflinks.colors.cnames
            hex : '#ffffff'
            rgb : 'rgb(23,50,23)'
            rgba : 'rgba(23,50,23,.5)
    layout_update : dict
        The Layout will be explicitly modified with the values stated in the dictionary.
        Not valid when Layout is passed as a parameter

ANNOTATIONS
    annotations : dict
        Dictionary of annotations
        {x_point : text}
    fontcolor : string
        Text color
    fontsize : int
        Text size
    textangle : int
        Text angle

EXPORTS
    asFigure : bool
        If True then it returns a Plotly Figure
    asImage : bool
        If True then it returns an image (PNG)
        While in ONLINE mode:
            Image file is saved in the working directory
            Accepts:
                filename
                dimensions
                scale
                display_image
        While in OFFLINE mode:
            Image file is downloaded (downloads folder) and a regular plotly chart is displayed in Jupyter
            Accepts:
                filename
                dimensions
    asPlot : bool
        If True then the chart opens in a browser
    asURL : bool
        If True the chart url/path is returned. No chart is displayed.
        If ONLINE : The URL is returned
        If OFFLINE : the local path is returned
    display_image : bool
        If True, then the image is displayed after being saved.
        Only valid if 'asImage=True'
    filename : string
        Filename to be saved as
    online : bool
        If True then the chart/image is rendered on the server even when running in Offline mode
    scale : int
        Increase the resolution of the image by `scale` amount
        Only valid if 'asImage=True'
    sharing : string
        Sets the sharing level permission
            public - anyone can see the chart
            private - only you can see this chart
            secret - only people with the link can see the chart

SHAPES
    hline : float, list or dict
        Draws a horizontal line at the indicated 'y' position(s).
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            hline=4
            hline=[2,10]
            hline=[{'y':2,'color':'blue'},{'y':3,'color':'red'}]
    hspan : tuple, list or dict
        Draws a horizontal rectangle at the indicated (y0,y1) positions.
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            hspan=(1,5)
            hspan=[(1,4),(6,10)]
            hspan=[{'y0':2,'y1':5,'color':'blue','fill':True,'opacity':.4}]
    vline : float, list or dict
        Draws a vertical line at the indicated 'x' position(s).
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            vline=4
            vline=[2,10]
            vline=[{'x':'2015-02-08','color':'blue'},{'x':'2015-03-08','color':'red'}]
    vspan : tuple, list or dict
        Draws a vertical rectangle at the indicated (x0,x1) positions.
        Extra parameters can be passed in the form of a dictionary (see 'shapes')
            vspan=('2015-02-08','2015-03-08')
            vspan=[(1,4),(6,10)]
            vspan=[{'x0':2,'x1':5,'color':'blue','fill':True,'opacity':.4}]
    shapes : list or dict
        List of dictionaries with the specification of a given shape.
        For more information see help(cufflinks.tools.get_shape)

SUBPLOTS
    horizontal_spacing : float [0-1]
        Space between subplot columns
    shape : (int,int)
        Indicates the size of rows and columns.
        If ommitted, then the shape is automatically set
        * Only valid if subplots=True
        (rows,columns)
    shared_xaxes : bool
        If True, subplots in the same grid column have one common shared x-axis at the bottom of the grid.
    shared_yaxes : bool
        If True, subplots in the same grid row have one common shared y-axis at the left of the grid.
    subplot_titles : bool
        If True, chart titles are displayed at the top of each subplot.
    subplots : bool
        If True then each trace is placed in a subplot
    vertical_spacing : float [0-1]
        Space between subplot rows

AXIS
    logx : bool
        Sets the x axis to be of logarithmic scale
    logy : bool
        Sets the y axis to be of logarithmic scale
    logz : bool
        Sets the z axis to be of logarithmic scale
    xrange : tuple
        Sets the range for the x axis
        (lower_bound,upper_bound)
    yrange : tuple
        Sets the range for the y axis
        (lower_bound,upper_bound)

EXAMPLES
    >> cf.datagen.lines(1).iplot(kind='scatter')
    >> cf.datagen.scatter().iplot(kind='scatter',x='x',y='y',categories='categories',mode='markers')
    >> cf.datagen.lines(1).iplot(kind='scatter',error_type='continuous_percent',error_y=15)
1-5-1 scatter chart
In [11]:
df.iplot(kind='scatter')
1-5-2 line chart
In [12]:
df.iplot(kind='line')
1-5-3 scatter chart, mode = lines
In [13]:
df.iplot(kind='scatter', mode='lines')
1-5-4 scatter chart, mode = markers
In [14]:
df.iplot(kind='scatter', mode='markers')
1-5-5 scatter chart, mode = markers + lines
In [15]:
df.iplot(kind='scatter', mode='markers+lines')
1-5-6 scatter chart, fill = True (fill the area under the lines)
In [16]:
df.iplot(kind='scatter', fill=True)
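The `mode` values used above are combinations of flags joined by `+`. A small sketch of a helper that tidies such a string before it is passed to `iplot`. The function name `normalize_mode` is hypothetical, and the flag set (`lines`, `markers`, `text`) is taken from the values listed in the scatter help output.

```python
VALID_FLAGS = ("lines", "markers", "text")  # order used in the help output


def normalize_mode(mode):
    """Turn e.g. 'markers + lines' into 'lines+markers'.

    Whitespace is dropped, duplicates are merged, and flags are emitted
    in a fixed order. Unknown flags raise ValueError.
    """
    flags = {part.strip() for part in mode.split("+")}
    unknown = flags - set(VALID_FLAGS)
    if unknown:
        raise ValueError(f"invalid scatter mode: {mode!r}")
    return "+".join(flag for flag in VALID_FLAGS if flag in flags)


print(normalize_mode("markers + lines"))  # lines+markers
print(normalize_mode("lines"))            # lines
```

In the notebook this could be used as `df.iplot(kind='scatter', mode=normalize_mode('markers + lines'))`.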
2. Plot Styling
2-1 Setting chart titles
In [17]:
df.iplot(kind='bar', xTitle="X title", yTitle="Y title", title="title")
2-2 Changing detailed elements
In [18]:
layout = {
    'title': {
        'text': 'TEST',              # main title
        'font': {
            'family': 'roboto',      # chart title font family
            'size': 25               # chart title font size
        },
        'x': 0.5,                    # title position along the x axis (0 to 1)
        'y': 0.9                     # title position along the y axis (0 to 1)
    },
    'xaxis': {
        'showticklabels': True,      # show x-axis tick labels
        'dtick': 1,                  # x-axis tick spacing
        'title': {
            'text': 'X axis',        # x-axis title
            'font': {
                'size': 10,          # x-axis title font size
            }
        }
    },
    'yaxis': {
        'showticklabels': False,     # hide y-axis tick labels
        'dtick': 0.1,                # y-axis tick spacing
        'title': {
            'text': 'Y axis',        # y-axis title
            'font': {
                'size': 10           # y-axis title font size
            }
        }
    }
}
df.iplot(kind='bar', layout=layout)
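When several charts share the same styling, a nested dictionary like the one above can be produced by a small function so that only the texts and sizes change. This is a sketch: `make_layout` and its parameters are hypothetical names, and the keys are the same ones used in the cell above.

```python
def make_layout(title, x_title, y_title, title_size=25, axis_title_size=10):
    """Return a layout dict with a centered title and titled axes."""
    def axis(text):
        # Both axes share the same structure; only the title text differs.
        return {
            "showticklabels": True,
            "title": {"text": text, "font": {"size": axis_title_size}},
        }

    return {
        "title": {
            "text": title,
            "font": {"size": title_size},
            "x": 0.5,   # horizontal position of the title, 0 to 1
            "y": 0.9,   # vertical position of the title, 0 to 1
        },
        "xaxis": axis(x_title),
        "yaxis": axis(y_title),
    }


shared_layout = make_layout("TEST", "X axis", "Y axis")
print(shared_layout["title"]["text"], shared_layout["xaxis"]["title"]["text"])
```

The result can be passed in the same way as before, for example `df.iplot(kind='bar', layout=make_layout('TEST', 'X axis', 'Y axis'))`.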
3. Chart themes
3-1 Listing and applying themes
In [19]:
cf.getThemes()
Out[19]:
['ggplot', 'pearl', 'solar', 'space', 'white', 'polar', 'henanigans']
3-1-1 theme = ggplot
In [20]:
df.iplot(kind='bar', theme='ggplot')
3-1-2 theme = pearl
In [21]:
df.iplot(kind='bar', theme='pearl')
3-1-3 theme = solar
In [22]:
df.iplot(kind='bar', theme='solar')
3-1-4 theme = space
In [23]:
df.iplot(kind='bar', theme='space')
3-1-5 theme = white
In [24]:
df.iplot(kind='bar', theme='white')
3-1-6 theme = polar
In [25]:
df.iplot(kind='bar', theme='polar')
3-1-7 theme = henanigans
In [27]:
df.iplot(kind='bar', theme='henanigans')
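The seven cells above differ only in the `theme` argument, so the comparison can also be written as a loop. A sketch under the assumption that the plotting function accepts `kind`, `theme`, and `title` keywords, as the help output for `iplot` lists. `plot_with_each_theme` is a hypothetical helper, and `record` is a stand-in for `df.iplot` so that the example runs without cufflinks.

```python
def plot_with_each_theme(plot, themes, kind="bar"):
    """Call plot(kind=..., theme=..., title=...) once per theme."""
    for theme in themes:
        # Put the theme name in the title so the charts are easy to tell apart.
        plot(kind=kind, theme=theme, title=f"theme = {theme}")


# Stand-in for df.iplot: it only records the keyword arguments it receives.
calls = []


def record(**kwargs):
    calls.append(kwargs)


themes = ["ggplot", "pearl", "solar", "space", "white", "polar", "henanigans"]
plot_with_each_theme(record, themes)
print(len(calls), calls[0])
```

In the notebook, the equivalent call would be `plot_with_each_theme(df.iplot, cf.getThemes())`.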
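Having tried plotly, I find that it is used in much the same way as the existing libraries, but the charts look nicer. The pros and cons should become clearer once I have used it more and grown familiar with it. For now, plotly seems to have many advantages. Getting used to it will take a little time, but it is worth a try. Next time, I will write up Plotly Express.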