<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Deep Learning | Tengku Hanis</title>
    <link>https://tengkuhanis.netlify.app/tag/deep-learning/</link>
      <atom:link href="https://tengkuhanis.netlify.app/tag/deep-learning/index.xml" rel="self" type="application/rss+xml" />
    <description>Deep Learning</description>
    <generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>©Tengku Hanis 2020-2025 Made with [blogdown](https://github.com/rstudio/blogdown)</copyright><lastBuildDate>Wed, 28 Dec 2022 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://tengkuhanis.netlify.app/images/icon_hua2ec155b4296a9c9791d015323e16eb5_11927_512x512_fill_lanczos_center_2.png</url>
      <title>Deep Learning</title>
      <link>https://tengkuhanis.netlify.app/tag/deep-learning/</link>
    </image>
    
    <item>
      <title>Visualising augmented images in Keras</title>
      <link>https://tengkuhanis.netlify.app/post/visualising-augmented-images-in-keras/</link>
      <pubDate>Wed, 28 Dec 2022 00:00:00 +0000</pubDate>
      <guid>https://tengkuhanis.netlify.app/post/visualising-augmented-images-in-keras/</guid>
      <description>


&lt;div id=&#34;data-augmentation&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Data augmentation&lt;/h2&gt;
&lt;p&gt;Data augmentation is been used in deep learning for many reasons. One of the reason is to reduce overfitting and makes the model more robust. Data augmentation can be done relatively easy in &lt;code&gt;keras&lt;/code&gt; package in R. However, I have not found any resources on how to visualise the augmented image in R except in Python. Visualising the augmented image can be quite useful to get an idea of how the image looks like. So, this post covers a simple to do this in R.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;r-code&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;R code&lt;/h2&gt;
&lt;p&gt;Let’s load the &lt;code&gt;keras&lt;/code&gt; library&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(keras)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Warning: package &amp;#39;keras&amp;#39; was built under R version 4.2.2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, we load the image from the internet.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;r_logo &amp;lt;- 
  get_file(&amp;quot;img&amp;quot;, &amp;quot;https://ih1.redbubble.net/image.522493300.6771/st,small,507x507-pad,600x600,f8f8f8.jpg&amp;quot;) %&amp;gt;% 
  image_load()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Our image right now is 600x 600 x 3. The 3 at the back because the image is coloured (RGB channels).&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;r_logo$size&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [1] 600
## 
## [[2]]
## [1] 600&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, we need to change the image into an array with the dimension of 1 x 600 x 600 x 3. The number 1 indicates we have only one image.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;r_logo &amp;lt;- 
  r_logo %&amp;gt;% 
  image_to_array() %&amp;gt;% 
  array_reshape(c(1, dim(.)))
dim(r_logo)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1]   1 600 600   3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once we have a correct dimension, we can specify the parameters for the data augmentation.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;augment_params &amp;lt;- image_data_generator(horizontal_flip = T, 
                                       vertical_flip = T,
                                       rotation_range = 0.5,
                                       zoom_range = 0.5,
                                       fill_mode = &amp;quot;reflect&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am not going to into the details of the parameters. For those interested, the &lt;a href=&#34;https://tensorflow.rstudio.com/reference/keras/image_data_generator&#34;&gt;TensorFlow for R website&lt;/a&gt; explain this very well.&lt;/p&gt;
&lt;p&gt;Next, we can generate the batch of augmented data at random. This function, however, will only run once we fit the model.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;img_gen &amp;lt;- flow_images_from_data(r_logo,
                                 generator = augment_params, 
                                 batch_size = 1)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, we can plot the image. Firstly, this is our original image.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;img_gen$x [1,,,] %&amp;gt;% 
  as.raster(max = 255) %&amp;gt;% 
  as.array() %&amp;gt;% 
  plot()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://tengkuhanis.netlify.app/post/visualising-augmented-images-in-keras/index.en_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Now, we going to loop the augmentation process. Here, we going to generate six augmented images. The &lt;code&gt;set.seed&lt;/code&gt; for reproducibility.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(123)
par(mfrow = c(3, 2), mar = c(1, 0, 1, 0))

for (i in 1:6) {
  IMG &amp;lt;- img_gen$`next`()
  IMG[1,,,] %&amp;gt;% as.raster(max = 255) %&amp;gt;% as.array() %&amp;gt;% plot()
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://tengkuhanis.netlify.app/post/visualising-augmented-images-in-keras/index.en_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;conclusion&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I believe this is quite useful to get a sense of how your data is augmented. Consequently, this may help in selecting the parameters for the data augmentation.&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Exponentially Weighted Average in Deep Learning</title>
      <link>https://tengkuhanis.netlify.app/post/exponentially-weighted-average-in-deep-learning/</link>
      <pubDate>Sun, 09 May 2021 00:00:00 +0000</pubDate>
      <guid>https://tengkuhanis.netlify.app/post/exponentially-weighted-average-in-deep-learning/</guid>
      <description>
&lt;script src=&#34;https://tengkuhanis.netlify.app/post/exponentially-weighted-average-in-deep-learning/index.en_files/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;


&lt;p&gt;I have been reading about lost functions and optimisers in deep learning for the last couple of days when I stumble upon the term Exponentially Weighted Average (EWA). So, in this post I aims to explain my understanding of EWA.&lt;/p&gt;
&lt;div id=&#34;overview-of-ewa&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview of EWA&lt;/h2&gt;
&lt;p&gt;EWA basically is an important concept in deep learning and have been used in several optimisers to smoothen the noise of the data.&lt;/p&gt;
&lt;p&gt;Let’s see the formula for EWA:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;formula.png&#34; width=&#34;60%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;V&lt;sub&gt;t&lt;/sub&gt;&lt;/em&gt; is some smoothen value at point &lt;em&gt;t&lt;/em&gt;, while &lt;em&gt;S&lt;sub&gt;t&lt;/sub&gt;&lt;/em&gt; is a data point at point &lt;em&gt;t&lt;/em&gt;. &lt;em&gt;B&lt;/em&gt; here is a hyperparameter that we need to tune in our network. So, the choice of &lt;em&gt;B&lt;/em&gt; will determine how many data points that we average the value of &lt;em&gt;V&lt;sub&gt;t&lt;/sub&gt;&lt;/em&gt; as shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;beta.png&#34; width=&#34;80%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;ewa-in-deep-learnings-optimiser&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;EWA in deep learnings’ optimiser&lt;/h2&gt;
&lt;p&gt;So, some of the optimisers that adopt the approach of EWA are (red box indicates the EWA part in each formula):&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Stochastic gradient descent (SGD) with momentum&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The issue with SGD is the present of noise while searching for global minima. So, SGD with momentum integrated the EWA, which reduces these noises and helps the network converges faster.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;SGD-momentum2.png&#34; width=&#34;80%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;ol start=&#34;2&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Adaptive delta (Adadelta) and Root Mean Square Propagation (RMSprop)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Adadelta and RMSprop are proposed in attempt to solve the issue of diminishing learning rate of adaptive gradient (Adagrad) optimiser. The use of EWA in both optimisers actually helps to achieve this. Both optimisers have quite a similar formula, but attached below is the formula for Adadelta.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;adadelta2.png&#34; width=&#34;80%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;ol start=&#34;3&#34; style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Adaptive moment estimation (ADAM)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;ADAM basically combined the SGD with momentum with Adadelta. As shown earlier, both optimisers use EWA.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;more-details-on-ewa&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;More details on EWA&lt;/h2&gt;
&lt;p&gt;Now, let’s go back to EWA. Here is the example of calculation of EWA:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;seq1.png&#34; width=&#34;90%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Keep in mind that &lt;em&gt;t&lt;sub&gt;3&lt;/sub&gt;&lt;/em&gt; is the latest time point, followed by &lt;em&gt;t&lt;sub&gt;2&lt;/sub&gt;&lt;/em&gt; and &lt;em&gt;t&lt;sub&gt;1&lt;/sub&gt;&lt;/em&gt;, respectively. So, if we want to calculate &lt;em&gt;V&lt;sub&gt;3&lt;/sub&gt;&lt;/em&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;seq2.png&#34; width=&#34;90%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;So, if we were to varies the value of &lt;em&gt;B&lt;/em&gt; across the equation (while the values of &lt;em&gt;a&lt;sub&gt;1&lt;/sub&gt;…a&lt;sub&gt;n&lt;/sub&gt;&lt;/em&gt; remain constant), we can do so in R.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse) 

func &amp;lt;- function(b) (1 - b) * b^((20:1) - 1)
beta &amp;lt;- seq(0.1, 0.9, by=0.2)

dat &amp;lt;- t(sapply(beta, func)) %&amp;gt;% 
  as.data.frame()
colnames(dat)[1:20] &amp;lt;- 1:20

dat %&amp;gt;%  
  mutate(beta = as_factor(beta)) %&amp;gt;%
  pivot_longer(cols = 1:20, names_to = &amp;quot;data_point&amp;quot;, values_to = &amp;quot;weight&amp;quot;) %&amp;gt;% 
  ggplot(aes(x=as.numeric(data_point), y=weight, color=beta)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 1:20) +
  labs(title = &amp;quot;Change of Exponentially Weighted Average function&amp;quot;, 
       subtitle = &amp;quot;Time at t20 is the recent time, and t1 is the initial time&amp;quot;) +
  scale_colour_discrete(&amp;quot;Beta:&amp;quot;) +
  xlab(&amp;quot;Time(t)&amp;quot;) +
  ylab(&amp;quot;Weights/Coefficients&amp;quot;) +
  theme_bw()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://tengkuhanis.netlify.app/post/exponentially-weighted-average-in-deep-learning/index.en_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Note that time at t&lt;sub&gt;20&lt;/sub&gt; is the recent time, and t&lt;sub&gt;1&lt;/sub&gt; is the initial time. Thus, two main points from the above plot are:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The EWA function acts in a decaying manner.&lt;br /&gt;
&lt;/li&gt;
&lt;li&gt;As beta, &lt;em&gt;B&lt;/em&gt; increases we actually put more emphasize on the recent data point.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;Side note: I have tried to do the plot in plotly, not sure why it did not work&lt;/em&gt; 😕&lt;/p&gt;
&lt;p&gt;References:&lt;br /&gt;
1) &lt;a href=&#34;https://towardsdatascience.com/deep-learning-optimizers-436171c9e23f&#34; class=&#34;uri&#34;&gt;https://towardsdatascience.com/deep-learning-optimizers-436171c9e23f&lt;/a&gt; (all the equations are from this reference)&lt;br /&gt;
2) &lt;a href=&#34;https://youtu.be/NxTFlzBjS-4&#34; class=&#34;uri&#34;&gt;https://youtu.be/NxTFlzBjS-4&lt;/a&gt;&lt;br /&gt;
3) &lt;a href=&#34;https://medium.com/@dhartidhami/exponentially-weighted-averages-5de212b5be46&#34; class=&#34;uri&#34;&gt;https://medium.com/@dhartidhami/exponentially-weighted-averages-5de212b5be46&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
