Neural Network Demonstration
Interactive Deep Learning Model with 5 Layers: 3 Input → 4 Hidden → 3 Hidden → 2 Hidden → 2 Output Nodes
Network Architecture
A simple feedforward neural network with adjustable parameters
Input Layer
Hidden Layer 1
Hidden Layer 2
Hidden Layer 3
Output Layer
Input Values
Target Values
Mathematical Formulas & Calculations
Real-time view of all mathematical operations
Activation Function (Sigmoid)
Formula: σ(x) = 1 / (1 + e^(-x))
This function maps any real number to a value between 0 and 1
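A minimal Python sketch of this function (the function name and sample points are illustrative, not part of the demo):

    import math

    def sigmoid(x: float) -> float:
        """Map any real number into the open interval (0, 1)."""
        return 1.0 / (1.0 + math.exp(-x))

    # Large negative inputs approach 0, large positive inputs approach 1,
    # and sigmoid(0) is exactly 0.5.
    print(sigmoid(-5.0))  # ~0.0067
    print(sigmoid(0.0))   # 0.5
    print(sigmoid(0.84))  # ~0.6985 (the first Hidden Layer 1 sum computed below)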
Current Bias Values
Hidden Layer 1:
H1-1: 0.100
H1-2: -0.200
H1-3: 0.300
H1-4: 0.000
Hidden Layer 2:
H2-1: 0.200
H2-2: -0.100
H2-3: 0.400
Hidden Layer 3:
H3-1: 0.100
H3-2: -0.300
Output Layer:
Out-1: 0.200
Out-2: -0.100
Hidden Layer 1 Calculations
H1-1:
sum = 0.50 × 0.40 + 0.30 × 0.20 + 0.80 × 0.60 + bias
sum = 0.7400 + 0.100 = 0.8400
σ(0.8400) = 0.6985
H1-2:
sum = 0.50 × 0.70 + 0.30 × -0.30 + 0.80 × 0.10 + bias
sum = 0.3400 - 0.200 = 0.1400
σ(0.1400) = 0.5349
H1-3:
sum = 0.50 × -0.20 + 0.30 × 0.80 + 0.80 × -0.40 + bias
sum = -0.1800 + 0.300 = 0.1200
σ(0.1200) = 0.5300
H1-4:
sum = 0.50 × 0.50 + 0.30 × 0.10 + 0.80 × 0.90 + bias
sum = 1.0000 + 0.000 = 1.0000
σ(1.0000) = 0.7311
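As a sanity check, here is a small Python sketch (the helper name layer_forward is illustrative, not from the demo) that reproduces the Hidden Layer 1 numbers above; pointing it at the later layers' weights and biases reproduces Hidden Layers 2-3 and the output layer the same way:

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def layer_forward(inputs, weights, biases):
        # weights[j][i] is the connection from input node i to neuron j.
        return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
                for row, b in zip(weights, biases)]

    inputs = [0.50, 0.30, 0.80]                 # the demo's input values
    w1 = [[0.40, 0.20, 0.60],                   # H1-1 weights
          [0.70, -0.30, 0.10],                  # H1-2 weights
          [-0.20, 0.80, -0.40],                 # H1-3 weights
          [0.50, 0.10, 0.90]]                   # H1-4 weights
    b1 = [0.100, -0.200, 0.300, 0.000]          # Hidden Layer 1 biases

    print(layer_forward(inputs, w1, b1))        # ~[0.6985, 0.5349, 0.5300, 0.7311]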
Hidden Layer 2 Calculations
H2-1:
sum = 0.6985 × 0.30 + 0.5349 × 0.80 + 0.5300 × -0.10 + 0.7311 × 0.50 + bias
sum = 0.9500 + 0.200 = 1.1500
σ(1.1500) = 0.7595
H2-2:
sum = 0.6985 × -0.50 + 0.5349 × 0.20 + 0.5300 × 0.90 + 0.7311 × -0.30 + bias
sum = 0.0154 - 0.100 = -0.0846
σ(-0.0846) = 0.4789
H2-3:
sum = 0.6985 × 0.70 + 0.5349 × -0.60 + 0.5300 × 0.40 + 0.7311 × 0.10 + bias
sum = 0.4530 + 0.400 = 0.8530
σ(0.8530) = 0.7012
Hidden Layer 3 Calculations
H3-1:
sum = 0.7595 × 0.40 + 0.4789 × 0.70 + 0.7012 × -0.20 + bias
sum = 0.4988 + 0.100 = 0.5988
σ(0.5988) = 0.6454
H3-2:
sum = 0.7595 × -0.60 + 0.4789 × 0.30 + 0.7012 × 0.90 + bias
sum = 0.3190 - 0.300 = 0.0190
σ(0.0190) = 0.5048
Output Layer Calculations
Output 1:
sum = 0.6454 × 0.60 + 0.5048 × 0.20 + bias
sum = 0.4882 + 0.200 = 0.6882
σ(0.6882) = 0.6656
Output 2:
sum = 0.6454 × -0.40 + 0.5048 × 0.80 + bias
sum = 0.1456 - 0.100 = 0.0456
σ(0.0456) = 0.5114
Loss Calculation (Squared Error)
Formula: Loss = 0.5 × Σ(predicted - target)² (the factor 0.5 cancels when differentiating, leaving predicted - target)
Loss = 0.5 × [(0.6656 - 0.80)² + (0.5114 - 0.20)²]
Loss = 0.5 × [(-0.1344)² + (0.3114)²]
Loss = 0.5 × [0.018063 + 0.096970]
Loss = 0.057517
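The same loss value falls out of a few lines of Python (variable names are illustrative):

    predicted = [0.6656, 0.5114]   # network outputs from the forward pass above
    targets = [0.80, 0.20]         # target values

    loss = 0.5 * sum((p - t) ** 2 for p, t in zip(predicted, targets))
    print(loss)  # ~0.0575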
Backpropagation: Loss Gradients for Each Layer
Output Layer Gradients
Output 1:
Error = predicted - target = 0.6656 - 0.80 = -0.1344
σ'(x) = output × (1 - output) = 0.6656 × 0.3344 = 0.2226
Gradient = -0.1344 × 0.2226 = -0.029917
Output 2:
Error = predicted - target = 0.5114 - 0.20 = 0.3114
σ'(x) = output × (1 - output) = 0.5114 × 0.4886 = 0.2499
Gradient = 0.3114 × 0.2499 = 0.077819
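In code, both output gradients (often called deltas) come from one list comprehension; a sketch with illustrative names:

    predicted = [0.6656, 0.5114]
    targets = [0.80, 0.20]

    # delta = (predicted - target) * sigmoid'(sum), with sigmoid'(sum) = out * (1 - out)
    output_deltas = [(p - t) * p * (1 - p) for p, t in zip(predicted, targets)]
    print(output_deltas)  # ~[-0.0299, 0.0778]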
Hidden Layer 3 Gradients
H3-1:
Weighted error sum = (-0.1344 × 0.2226 × 0.60) + (0.3114 × 0.2499 × -0.40)
Weighted error sum = -0.017950 - 0.031128 = -0.049078
σ'(x) = 0.6454 × 0.3546 = 0.2289
Gradient = -0.049078 × 0.2289 = -0.011234
H3-2:
Weighted error sum = (-0.1344 × 0.2226 × 0.20) + (0.3114 × 0.2499 × 0.80)
Weighted error sum = -0.005983 + 0.062255 = 0.056272
σ'(x) = 0.5048 × 0.4952 = 0.2500
Gradient = 0.056272 × 0.2500 = 0.014068
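Propagating the output deltas one layer back is a weighted sum over the outgoing connections, scaled by each neuron's local sigmoid derivative; a sketch with illustrative names:

    output_deltas = [-0.029917, 0.077819]   # from the output layer above
    h3_acts = [0.6454, 0.5048]              # H3 activations from the forward pass
    w_out = [[0.60, 0.20],                  # Output 1's weights from H3-1, H3-2
             [-0.40, 0.80]]                 # Output 2's weights from H3-1, H3-2

    h3_deltas = [sum(d * w_out[k][i] for k, d in enumerate(output_deltas)) * a * (1 - a)
                 for i, a in enumerate(h3_acts)]
    print(h3_deltas)  # ~[-0.0112, 0.0141]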
Hidden Layer 2 Gradients
H2-1:
Propagated from H3 layer (simplified)
Weighted error sum = (-0.011234 × 0.40) + (0.014068 × -0.60) = -0.012934
σ'(x) = 0.7595 × 0.2405 = 0.1827
Gradient = -0.012934 × 0.1827 = -0.002363
H2-2:
Propagated from H3 layer (simplified)
Weighted error sum = (-0.011234 × 0.70) + (0.014068 × 0.30) = -0.003643
σ'(x) = 0.4789 × 0.5211 = 0.2496
Gradient = -0.003643 × 0.2496 = -0.000909
H2-3:
Propagated from H3 layer (simplified)
Weighted error sum = (-0.011234 × -0.20) + (0.014068 × 0.90) = 0.014908
σ'(x) = 0.7012 × 0.2988 = 0.2095
Gradient = 0.014908 × 0.2095 = 0.003123
Hidden Layer 1 Gradients
H1-1:
Propagated from H2 layer (simplified)
σ'(x) = 0.6985 × 0.3015 = 0.2106
Simplified Gradient = 0.000407
H1-2:
Propagated from H2 layer (simplified)
σ'(x) = 0.5349 × 0.4651 = 0.2488
Simplified Gradient = -0.000982
H1-3:
Propagated from H2 layer (simplified)
σ'(x) = 0.5300 × 0.4700 = 0.2491
Simplified Gradient = 0.000166
H1-4:
Propagated from H2 layer (simplified)
σ'(x) = 0.7311 × 0.2689 = 0.1966
Simplified Gradient = -0.000117
Weight Update Examples
Hidden Layer 3 → Output Weight Update:
For weight H3-1 → Output-1:
Current weight: 0.6000
Update = learning_rate × output_gradient × h3_activation
Update = 0.10 × -0.029917 × 0.6454
Update = -0.001931
New weight = 0.6000 - (-0.001931) = 0.6019
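The same update in Python (variable names are illustrative):

    lr = 0.10                      # learning rate
    w = 0.6000                     # current weight H3-1 -> Output-1
    output_gradient = -0.029917    # Output 1's delta from the backward pass
    h3_activation = 0.6454         # H3-1's activation from the forward pass

    update = lr * output_gradient * h3_activation
    w_new = w - update
    print(w_new)  # ~0.6019 (the weight grows because the gradient is negative)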
Backpropagation Formula Summary
Chain Rule Application:
∂Loss/∂weight = ∂Loss/∂output × ∂output/∂sum × ∂sum/∂weight
Where:
• ∂Loss/∂output = output - target (for the squared-error loss above)
• ∂output/∂sum = σ'(sum) = output × (1 - output)
• ∂sum/∂weight = previous_layer_activation
Weight Update: w_new = w_old - α × gradient
Where α = learning rate
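One way to convince yourself the chain rule above is right is a finite-difference check: nudge a single weight, re-run the forward pass, and compare the numeric slope of the loss against the analytic gradient. A sketch for the H3-1 → Output-1 weight (helper names are illustrative):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    h3 = [0.6454, 0.5048]   # H3 activations, held fixed
    target = 0.80           # target for Output 1

    def loss_of_w(w):
        # Only Output 1's term depends on this weight; Output 2's term is constant.
        out = sigmoid(w * h3[0] + 0.20 * h3[1] + 0.200)  # other weight and bias fixed
        return 0.5 * (out - target) ** 2

    w = 0.60
    out = sigmoid(w * h3[0] + 0.20 * h3[1] + 0.200)
    analytic = (out - target) * out * (1 - out) * h3[0]  # chain rule
    eps = 1e-6
    numeric = (loss_of_w(w + eps) - loss_of_w(w - eps)) / (2 * eps)
    print(analytic, numeric)  # both ~ -0.0193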
Understanding the Network
How this deep neural network works
Network Architecture:
- Input Layer: Three nodes that receive input values (blue circles)
- Hidden Layer 1: Four nodes that process inputs from the input layer (purple circles)
- Hidden Layer 2: Three nodes that further process information (purple circles)
- Hidden Layer 3: Two nodes that create final feature representations (purple circles)
- Output Layer: Two nodes that produce the final predictions (green circles)
- Node Values: Each circle shows the current activation value after applying the sigmoid function
- Bias Terms: Each layer (except input) has bias values that are added to the weighted sum before activation
Mathematical Formula with Bias:
Complete Neuron Calculation:
output = σ(Σ(input_i × weight_i) + bias)
Where:
- σ = sigmoid activation function
- input_i = value from previous layer node i
- weight_i = connection weight from node i
- bias = learnable bias parameter for this neuron
How Information Flows:
- Forward Pass: Information flows from input → hidden layer 1 → hidden layer 2 → hidden layer 3 → output
- Activation Function: Each layer uses the sigmoid function to introduce non-linearity
- Feature Learning: Each hidden layer learns increasingly complex and abstract features
- Deep Architecture: Five layers let the network compose simple features into progressively richer hierarchical representations
Training Process:
- The network performs a forward pass through all 5 layers to generate predictions
- Error is calculated by comparing final outputs with target values
- Backpropagation adjusts weights layer by layer, working backwards from output to input
- Each layer learns a different level of abstraction during this deep learning process (one complete training step is sketched in code below)
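Those four steps condense into a compact, runnable sketch of one training step on this exact network (function and variable names are illustrative; a full implementation would repeat this over many examples and epochs):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # The demo's architecture (3 -> 4 -> 3 -> 2 -> 2) with its weights and biases.
    # weights[l][j][i] connects node i of layer l to node j of layer l+1.
    weights = [
        [[0.40, 0.20, 0.60], [0.70, -0.30, 0.10], [-0.20, 0.80, -0.40], [0.50, 0.10, 0.90]],
        [[0.30, 0.80, -0.10, 0.50], [-0.50, 0.20, 0.90, -0.30], [0.70, -0.60, 0.40, 0.10]],
        [[0.40, 0.70, -0.20], [-0.60, 0.30, 0.90]],
        [[0.60, 0.20], [-0.40, 0.80]],
    ]
    biases = [
        [0.100, -0.200, 0.300, 0.000],
        [0.200, -0.100, 0.400],
        [0.100, -0.300],
        [0.200, -0.100],
    ]

    def train_step(x, target, lr=0.10):
        # 1. Forward pass, remembering every layer's activations.
        acts = [x]
        for w, b in zip(weights, biases):
            acts.append([sigmoid(sum(a * wi for a, wi in zip(acts[-1], row)) + bi)
                         for row, bi in zip(w, b)])

        # 2. Loss: half the sum of squared errors.
        loss = 0.5 * sum((p - t) ** 2 for p, t in zip(acts[-1], target))

        # 3. Backward pass: delta = dLoss/dsum for each neuron, output to input.
        deltas = [(p - t) * p * (1 - p) for p, t in zip(acts[-1], target)]
        for l in range(len(weights) - 1, -1, -1):
            prev = acts[l]
            prev_deltas = None
            if l > 0:  # propagate before this layer's weights change
                prev_deltas = [sum(d * weights[l][j][i] for j, d in enumerate(deltas))
                               * prev[i] * (1 - prev[i]) for i in range(len(prev))]
            # 4. Gradient-descent update for this layer's weights and biases.
            for j, d in enumerate(deltas):
                for i, a in enumerate(prev):
                    weights[l][j][i] -= lr * d * a
                biases[l][j] -= lr * d
            deltas = prev_deltas
        return loss

    print(train_step([0.50, 0.30, 0.80], [0.80, 0.20]))  # ~0.0575 before the update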
Deep Learning Architecture: This 5-layer network demonstrates true deep learning! Notice how information is progressively transformed through multiple hidden layers (4→3→2 nodes), allowing the network to learn complex, hierarchical representations of the input data.
Experiment: Watch how values change across all layers as you adjust inputs. During training, observe how the deeper architecture allows for more sophisticated learning patterns compared to shallow networks.